why can't a table be part of the same publication as its schema

Started by Peter Eisentrautover 3 years ago46 messageshackers
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

Apparently, you can't add a table to a publication if its schema is
already part of the publication (and vice versa), e.g.,

=# alter publication p1 add table s1.t1;
ERROR: 22023: cannot add relation "s1.t1" to publication
DETAIL: Table's schema "s1" is already part of the publication or part
of the specified schema list.

Is there a reason for this? It looks a bit like a misfeature to me. It
constrains how you can move your tables around between schemas, based on
how somewhere else a publication has been constructed.

It seems to me that it shouldn't be difficult to handle the case that a
table is part of the publication via two different routes. (We must
already handle that since a subscription can use more than one publication.)

#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Eisentraut (#1)
Re: why can't a table be part of the same publication as its schema

On Thu, Sep 8, 2022 at 5:06 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

Apparently, you can't add a table to a publication if its schema is
already part of the publication (and vice versa), e.g.,

=# alter publication p1 add table s1.t1;
ERROR: 22023: cannot add relation "s1.t1" to publication
DETAIL: Table's schema "s1" is already part of the publication or part
of the specified schema list.

Is there a reason for this?

Yes, because otherwise, there was confusion while dropping the objects
from publication. Consider in the above case, if we would have allowed
it and then the user performs ALTER PUBLICATION p1 DROP ALL TABLES IN
SCHEMA s1, then (a) shall we remove both schema s1 and a table that is
separately added (s1.t1) from that schema, or (b) just remove schema
s1? There is a point that the user can expect that as she has added
them separately, so we should allow her to drop them separately as
well. OTOH, if we go that way, then it is again questionable that when
the user has asked to Drop all the tables in the schema (via DROP ALL
TABLES IN SCHEMA command) then why keep some tables?

The other confusion from allowing publications to have schemas and
tables from the same schema is that say the user has created a
publication with the command CREATE PUBLICATION pub1 FOR ALL TABLES IN
SCHEMA s1, and then she can ask to allow dropping one of the tables in
the schema by ALTER PUBLICATION pub1 DROP TABLE s1.t1. Now, it will be
tricky to support and I think it doesn't make sense as well.

Similarly, what if the user has first created a publication with
"CREATE PUBLICATION pub1 ADD TABLE s1.t1, s1.t2, ... s1.t99;" and then
tries to drop all tables in one shot by ALTER PUBLICATION DROP ALL
TABLES IN SCHEMA sch1;?

To avoid these confusions, we have disallowed adding a table if its
schema is already part of the publication and vice-versa. Also, there
won't be any additional benefit to doing so.

--
With Regards,
Amit Kapila.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#2)
Re: why can't a table be part of the same publication as its schema

Amit Kapila <amit.kapila16@gmail.com> writes:

To avoid these confusions, we have disallowed adding a table if its
schema is already part of the publication and vice-versa.

Really?

Is there logic in ALTER TABLE SET SCHEMA that rejects the command
dependent on the contents of the publication tables? If so, are
there locks taken in both ALTER TABLE SET SCHEMA and the
publication-modifying commands that are sufficient to prevent
race conditions in such changes?

This position sounds quite untenable from here, even if I found
your arguments-in-support convincing, which I don't really.
ISTM the rule should be along the lines of "table S.T should
be published either if schema S is published or S.T itself is".
There's no obvious need to interconnect the two conditions.

regards, tom lane

#4Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#3)
Re: why can't a table be part of the same publication as its schema

On Fri, Sep 9, 2022 at 11:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Kapila <amit.kapila16@gmail.com> writes:

To avoid these confusions, we have disallowed adding a table if its
schema is already part of the publication and vice-versa.

Really?

Is there logic in ALTER TABLE SET SCHEMA that rejects the command
dependent on the contents of the publication tables?

Yes, it has. For example,
postgres=# create schema s1;
CREATE SCHEMA
postgres=# create table s1.t1(c1 int);
CREATE TABLE
postgres=# create schema s2;
CREATE SCHEMA

postgres=# create publication pub1 for all tables in schema s2, table s1.t1;
CREATE PUBLICATION

postgres=# Alter table s1.t1 set schema s2;
ERROR: cannot move table "t1" to schema "s2"
DETAIL: The schema "s2" and same schema's table "t1" cannot be part
of the same publication "pub1".

If so, are
there locks taken in both ALTER TABLE SET SCHEMA and the
publication-modifying commands that are sufficient to prevent
race conditions in such changes?

Good point. I have checked it and found that ALTER TABLE SET SCHEMA
takes AccessExclusiveLock on relation and AccessShareLock on the
schema which it is going to set. The alter publication command takes
ShareUpdateExclusiveLock on relation for dropping/adding a table to
publication which will prevent any race condition with ALTER TABLE SET
SCHEMA. However, the alter publication command takes AccessShareLock
for dropping/adding schema which won't block with ALTER TABLE SET
SCHEMA command. So, I think we need to change the lock mode for it in
alter publication command.

This position sounds quite untenable from here, even if I found
your arguments-in-support convincing, which I don't really.
ISTM the rule should be along the lines of "table S.T should
be published either if schema S is published or S.T itself is".
There's no obvious need to interconnect the two conditions.

This rule is currently followed when a subscription has more than one
publication. It is just that we didn't allow it in the same
publication because of a fear that it may cause confusion for some of
the users. The other thing to look at here is that the existing case
of a "FOR ALL TABLES" publication also follows a similar rule such
that it doesn't allow adding individual tables if the publication is
for all tables. For example,

postgres=# create publication pub1 for all tables;
CREATE PUBLICATION
postgres=# alter publication pub1 add table t1;
ERROR: publication "pub1" is defined as FOR ALL TABLES
DETAIL: Tables cannot be added to or dropped from FOR ALL TABLES publications.

So, why shouldn't a "FOR ALL TABLES IN SCHEMA" publication follow a
similar behavior?

--
With Regards,
Amit Kapila.

#5Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#4)
Re: why can't a table be part of the same publication as its schema

On Fri, Sep 9, 2022 at 5:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

So, why shouldn't a "FOR ALL TABLES IN SCHEMA" publication follow a
similar behavior?

Surely that is not the same case at all. If you're publishing
everything, there's no point in also having a specific list of things
that you want published, but when you're publishing only some things,
there is. If my wife tells me to wash everything in the laundry basket
and also my nice pants, and I discover that my nice pants already
happen to be in the laundry basket, I do not tell her:

ERROR: my nice pants are already in the laundry basket

It feels like a mistake to me that there's any catalog representation
at all for a table that is published because it is part of a schema.
The way a feature like this should work is that the schema should be
labelled as published, and we should discover which tables are part of
it at any given time as we go. We shouldn't need separate catalog
entries for each table in the schema just because the schema is
published. But if we do have such catalog entries, surely there should
be a difference between the catalog entry that gets created when the
table is individually published and the one that gets created when the
containing schema is published. We have such tracking in other cases
(coninhcount, conislocal; attinhcount, attislocal).

In my opinion, this shouldn't have been committed working the way it does.

--
Robert Haas
EDB: http://www.enterprisedb.com

#6Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Robert Haas (#5)
RE: why can't a table be part of the same publication as its schema

On Friday, September 9, 2022 9:57 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Sep 9, 2022 at 5:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

So, why shouldn't a "FOR ALL TABLES IN SCHEMA" publication follow a
similar behavior?

Hi

It feels like a mistake to me that there's any catalog representation at all for a
table that is published because it is part of a schema.
The way a feature like this should work is that the schema should be labelled as
published, and we should discover which tables are part of it at any given time as
we go. We shouldn't need separate catalog entries for each table in the schema
just because the schema is published.

IIRC, the feature currently works almost the same as you described. It doesn't
create entry for tables that are published via its schema level, it only record
the published schema and check which tables are part of it.

Sorry, If I misunderstand your points or missed something.

Best regards,
Hou zj

#7Robert Haas
robertmhaas@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#6)
Re: why can't a table be part of the same publication as its schema

On Fri, Sep 9, 2022 at 10:29 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

IIRC, the feature currently works almost the same as you described. It doesn't
create entry for tables that are published via its schema level, it only record
the published schema and check which tables are part of it.

Oh, well if that's the case, that is great news. But then I don't
understand Amit's comment from before:

Yes, because otherwise, there was confusion while dropping the objects
from publication. Consider in the above case, if we would have allowed
it and then the user performs ALTER PUBLICATION p1 DROP ALL TABLES IN
SCHEMA s1, then (a) shall we remove both schema s1 and a table that is
separately added (s1.t1) from that schema, or (b) just remove schema
s1?

I believe that (b) is the correct behavior, so I assumed that this
issue must be some difficulty in implementing it, like a funny catalog
representation.

Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1
{ ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we
used this ALL TABLES IN SCHEMA language.

--
Robert Haas
EDB: http://www.enterprisedb.com

#8Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Robert Haas (#7)
Re: why can't a table be part of the same publication as its schema

On Sep 9, 2022, at 8:18 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1
{ ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we
used this ALL TABLES IN SCHEMA language.

The conversation, as I recall, was that "ADD SCHEMA foo" would only mean all tables in foo, until publication of other object types became supported, at which point "ADD SCHEMA foo" would suddenly mean more than it did before. People might find that surprising, so the "ALL TABLES IN" was intended to future-proof against surprising behavioral changes.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#7)
Re: why can't a table be part of the same publication as its schema

On Fri, Sep 9, 2022 at 8:48 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Sep 9, 2022 at 10:29 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

IIRC, the feature currently works almost the same as you described. It doesn't
create entry for tables that are published via its schema level, it only record
the published schema and check which tables are part of it.

Oh, well if that's the case, that is great news.

Yes, the feature works as you and Hou-San have mentioned.

But then I don't
understand Amit's comment from before:

Yes, because otherwise, there was confusion while dropping the objects
from publication. Consider in the above case, if we would have allowed
it and then the user performs ALTER PUBLICATION p1 DROP ALL TABLES IN
SCHEMA s1, then (a) shall we remove both schema s1 and a table that is
separately added (s1.t1) from that schema, or (b) just remove schema
s1?

I believe that (b) is the correct behavior, so I assumed that this
issue must be some difficulty in implementing it, like a funny catalog
representation.

No, it was because of syntax. IIRC, during development, Greg Nancarrow
raised a point [1]/messages/by-id/CAJcOf-fTRZ3HiA5xU0-O-PT390A7wuUUkjP8uX3aQJLBsJNVmw@mail.gmail.com that a user can expect the individually added
tables for a schema which is also part of the publication to also get
dropped when she specifies DROP ALL TABLES IN SCHEMA. IIRC,
originally, the patch had a behavior (b) but then changed due to
discussion around this point. But now that it seems you and others
don't feel that was right, we can change back to (b) as I think that
shouldn't be difficult to achieve.

Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1
{ ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we
used this ALL TABLES IN SCHEMA language.

It was exactly due to the reason Mark had mentioned in the email [2]/messages/by-id/596EA671-66DF-4285-8560-0270DC062353@enterprisedb.com.

[1]: /messages/by-id/CAJcOf-fTRZ3HiA5xU0-O-PT390A7wuUUkjP8uX3aQJLBsJNVmw@mail.gmail.com
[2]: /messages/by-id/596EA671-66DF-4285-8560-0270DC062353@enterprisedb.com

--
With Regards,
Amit Kapila.

#10Robert Haas
robertmhaas@gmail.com
In reply to: Mark Dilger (#8)
Re: why can't a table be part of the same publication as its schema

On Fri, Sep 9, 2022 at 2:17 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:

On Sep 9, 2022, at 8:18 AM, Robert Haas <robertmhaas@gmail.com> wrote:
Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1
{ ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we
used this ALL TABLES IN SCHEMA language.

The conversation, as I recall, was that "ADD SCHEMA foo" would only mean all tables in foo, until publication of other object types became supported, at which point "ADD SCHEMA foo" would suddenly mean more than it did before. People might find that surprising, so the "ALL TABLES IN" was intended to future-proof against surprising behavioral changes.

If I encountered this syntax in a vacuum, that's not what I would
think. I would think that ADD ALL TABLES IN SCHEMA meant add all the
tables in the schema to the publication one by one as individual
objects, i.e. add the tables that are currently as of this moment in
that schema to the publication; and I would think that ADD SCHEMA
meant remember that this schema is part of the publication and so
whenever tables are created and dropped in that schema (or moved in
and out) what is being published is automatically updated.

The analogy here seems to be to GRANT, which actually does support
both syntaxes. And if I understand correctly, GRANT ON SCHEMA gives
privileges on the schema; whereas GRANT ON ALL TABLES IN SCHEMA
modifies each table that is currently in that schema (never mind what
happens later).

--
Robert Haas
EDB: http://www.enterprisedb.com

#11Isaac Morland
isaac.morland@gmail.com
In reply to: Robert Haas (#10)
Re: why can't a table be part of the same publication as its schema

On Sat, 10 Sept 2022 at 19:18, Robert Haas <robertmhaas@gmail.com> wrote:

If I encountered this syntax in a vacuum, that's not what I would

think. I would think that ADD ALL TABLES IN SCHEMA meant add all the
tables in the schema to the publication one by one as individual
objects, i.e. add the tables that are currently as of this moment in
that schema to the publication; and I would think that ADD SCHEMA
meant remember that this schema is part of the publication and so
whenever tables are created and dropped in that schema (or moved in
and out) what is being published is automatically updated.

The analogy here seems to be to GRANT, which actually does support
both syntaxes. And if I understand correctly, GRANT ON SCHEMA gives
privileges on the schema; whereas GRANT ON ALL TABLES IN SCHEMA
modifies each table that is currently in that schema (never mind what
happens later).

Yes, except GRANT ON SCHEMA only grants access to the schema - CREATE or
USAGE. You cannot write GRANT SELECT ON SCHEMA to grant access to all
tables in the schema.

#12Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Robert Haas (#10)
Re: why can't a table be part of the same publication as its schema

On Sep 10, 2022, at 4:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I don't understand why we
used this ALL TABLES IN SCHEMA language.

The conversation, as I recall, was that "ADD SCHEMA foo" would only mean all tables in foo, until publication of other object types became supported, at which point "ADD SCHEMA foo" would suddenly mean more than it did before. People might find that surprising, so the "ALL TABLES IN" was intended to future-proof against surprising behavioral changes.

If I encountered this syntax in a vacuum, that's not what I would
think. I would think that ADD ALL TABLES IN SCHEMA meant add all the
tables in the schema to the publication one by one as individual
objects

Yes, it appears the syntax was chosen to avoid one kind of confusion, but created another kind. Per the docs on this feature:

FOR ALL TABLES IN SCHEMA
Marks the publication as one that replicates changes for all tables in the specified list of schemas, including tables created in the future.

Like you, I wouldn't expect that definition, given the behavior of GRANT with respect to the same grammatical construction.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#13Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Mark Dilger (#12)
RE: why can't a table be part of the same publication as its schema

On Monday, September 12, 2022 1:08 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:

On Sep 10, 2022, at 4:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I don't understand why we
used this ALL TABLES IN SCHEMA language.

The conversation, as I recall, was that "ADD SCHEMA foo" would only mean

all tables in foo, until publication of other object types became supported, at
which point "ADD SCHEMA foo" would suddenly mean more than it did before.
People might find that surprising, so the "ALL TABLES IN" was intended to
future-proof against surprising behavioral changes.

If I encountered this syntax in a vacuum, that's not what I would
think. I would think that ADD ALL TABLES IN SCHEMA meant add all the
tables in the schema to the publication one by one as individual
objects

Yes, it appears the syntax was chosen to avoid one kind of confusion, but created
another kind. Per the docs on this feature:

FOR ALL TABLES IN SCHEMA
Marks the publication as one that replicates changes for all tables in the
specified list of schemas, including tables created in the future.

Like you, I wouldn't expect that definition, given the behavior of GRANT with
respect to the same grammatical construction.

I'm a bit unsure if it should be compared to GRANT. Because even if we chose
"ALTER PUBLICATION p1 { ADD | DROP } SCHEMA name", it's also not
consistent with the meaning of GRANT ON SCHEMA, as GRANT ON SCHEMA doesn't
grant rights on the tables within schema if I understand correctly.

I feel we'd better compare the syntax with the existing publication command:
FOR ALL TABLES. If you create a publication FOR ALL TABLES, it means publishing
all the tables in the database *including* tables created in the future. I
think both the syntax and meaning of ALL TABLES IN SCHEMA are consistent with
the existing FOR ALL TABLES.

And the behavior is clearly documented, so personally I think it's fine.
https://www.postgresql.org/docs/devel/sql-createpublication.html
--
FOR ALL TABLES
Marks the publication as one that replicates changes for all tables in the database, including tables created in the future.
FOR ALL TABLES IN SCHEMA
Marks the publication as one that replicates changes for all tables in the specified list of schemas, including tables created in the future.
--

Besides, as mentioned(and suggested by Tom[1]/messages/by-id/155565.1628954580@sss.pgh.pa.us), we might support publishing
SEQUENCE(or others) in the future. It would give more flexibility to user if we
have another FOR ALL SEQUENCES(or other objects) IN SCHEMA.

[1]: /messages/by-id/155565.1628954580@sss.pgh.pa.us

Best regards,
Hou zj

#14vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#9)
Re: why can't a table be part of the same publication as its schema

On Sat, 10 Sept 2022 at 07:32, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Sep 9, 2022 at 8:48 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Sep 9, 2022 at 10:29 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

IIRC, the feature currently works almost the same as you described. It doesn't
create entry for tables that are published via its schema level, it only record
the published schema and check which tables are part of it.

Oh, well if that's the case, that is great news.

Yes, the feature works as you and Hou-San have mentioned.

But then I don't
understand Amit's comment from before:

Yes, because otherwise, there was confusion while dropping the objects
from publication. Consider in the above case, if we would have allowed
it and then the user performs ALTER PUBLICATION p1 DROP ALL TABLES IN
SCHEMA s1, then (a) shall we remove both schema s1 and a table that is
separately added (s1.t1) from that schema, or (b) just remove schema
s1?

I believe that (b) is the correct behavior, so I assumed that this
issue must be some difficulty in implementing it, like a funny catalog
representation.

No, it was because of syntax. IIRC, during development, Greg Nancarrow
raised a point [1] that a user can expect the individually added
tables for a schema which is also part of the publication to also get
dropped when she specifies DROP ALL TABLES IN SCHEMA. IIRC,
originally, the patch had a behavior (b) but then changed due to
discussion around this point. But now that it seems you and others
don't feel that was right, we can change back to (b) as I think that
shouldn't be difficult to achieve.

I have made the changes to allow creation of publication with a schema
and table of the same schema. The attached patch has the changes for
the same.
I'm planning to review and test the patch further.

Regards,
Vignesh

Attachments:

v1-0001-Allow-creation-of-publication-with-schema-and-tab.patchapplication/octet-stream; name=v1-0001-Allow-creation-of-publication-with-schema-and-tab.patchDownload+94-161
#15Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#13)
Re: why can't a table be part of the same publication as its schema

At Mon, 12 Sep 2022 04:26:48 +0000, "houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com> wrote in

On Monday, September 12, 2022 1:08 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:

On Sep 10, 2022, at 4:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I don't understand why we
used this ALL TABLES IN SCHEMA language.

The conversation, as I recall, was that "ADD SCHEMA foo" would only mean

all tables in foo, until publication of other object types became supported, at
which point "ADD SCHEMA foo" would suddenly mean more than it did before.
People might find that surprising, so the "ALL TABLES IN" was intended to
future-proof against surprising behavioral changes.

If I encountered this syntax in a vacuum, that's not what I would
think. I would think that ADD ALL TABLES IN SCHEMA meant add all the
tables in the schema to the publication one by one as individual
objects

Yes, it appears the syntax was chosen to avoid one kind of confusion, but created
another kind. Per the docs on this feature:

FOR ALL TABLES IN SCHEMA
Marks the publication as one that replicates changes for all tables in the
specified list of schemas, including tables created in the future.

Like you, I wouldn't expect that definition, given the behavior of GRANT with
respect to the same grammatical construction.

I'm a bit unsure if it should be compared to GRANT. Because even if we chose
"ALTER PUBLICATION p1 { ADD | DROP } SCHEMA name", it's also not
consistent with the meaning of GRANT ON SCHEMA, as GRANT ON SCHEMA doesn't
grant rights on the tables within schema if I understand correctly.

I feel we'd better compare the syntax with the existing publication command:
FOR ALL TABLES. If you create a publication FOR ALL TABLES, it means publishing
all the tables in the database *including* tables created in the future. I
think both the syntax and meaning of ALL TABLES IN SCHEMA are consistent with
the existing FOR ALL TABLES.

IMHO, I feel closer to Robert. "ALL TABLES IN SCHEMA" sounds like the
concrete tables at the time of invocation. While I agree that it is
not directly comparable to GRANT, but if I see "ALTER PUBLICATION p1
ADD SCHEMA s1", I automatically translate that into "all tables in the
schema s1 at the time of using this publication". At least, it would
cause less confusion when it were "ALT PUB p1 DROP SCEMA s1" aginst
"DROP ALL TABLES IN SCHEMA s1".

However..

And the behavior is clearly documented, so personally I think it's fine.
https://www.postgresql.org/docs/devel/sql-createpublication.html
--
FOR ALL TABLES
Marks the publication as one that replicates changes for all tables in the database, including tables created in the future.
FOR ALL TABLES IN SCHEMA
Marks the publication as one that replicates changes for all tables in the specified list of schemas, including tables created in the future.
--

Besides, as mentioned(and suggested by Tom[1]), we might support publishing
SEQUENCE(or others) in the future. It would give more flexibility to user if we
have another FOR ALL SEQUENCES(or other objects) IN SCHEMA.

[1] /messages/by-id/155565.1628954580@sss.pgh.pa.us

Fair point. Should be stupid, but how about the following?

CREATE PUBLICATION p1 FOR TABLES * IN SCHEMA s1;
DROP PUBLICATION p1 FOR TABLES * IN SCHEMA s1;
ATLER PUBLICATION p1 ADD TABLES * IN SCHEMA s1;
ALTER PUBLICATION p1 DROP TABLES * IN SCHEMA s1;

This is an analog of synchronous_standby_names. But I'm not sure a
bare asterisc can appear there.. We could use ANY instead?

CREATE PUBLICATION p1 FOR TABLES ANY IN SCHEMA s1;
...

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#16Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Kyotaro Horiguchi (#15)
RE: why can't a table be part of the same publication as its schema

On Tuesday, September 13, 2022 12:40 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

At Mon, 12 Sep 2022 04:26:48 +0000, "houzj.fnst@fujitsu.com"
<houzj.fnst@fujitsu.com> wrote in

On Monday, September 12, 2022 1:08 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:

On Sep 10, 2022, at 4:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I don't understand why we
used this ALL TABLES IN SCHEMA language.

The conversation, as I recall, was that "ADD SCHEMA foo" would
only mean

all tables in foo, until publication of other object types became
supported, at which point "ADD SCHEMA foo" would suddenly mean more than it did before.
People might find that surprising, so the "ALL TABLES IN" was
intended to future-proof against surprising behavioral changes.

If I encountered this syntax in a vacuum, that's not what I would
think. I would think that ADD ALL TABLES IN SCHEMA meant add all
the tables in the schema to the publication one by one as
individual objects

Yes, it appears the syntax was chosen to avoid one kind of
confusion, but created another kind. Per the docs on this feature:

FOR ALL TABLES IN SCHEMA
Marks the publication as one that replicates changes for all
tables in the specified list of schemas, including tables created in the future.

Like you, I wouldn't expect that definition, given the behavior of
GRANT with respect to the same grammatical construction.

I'm a bit unsure if it should be compared to GRANT. Because even if we
chose "ALTER PUBLICATION p1 { ADD | DROP } SCHEMA name", it's also not
consistent with the meaning of GRANT ON SCHEMA, as GRANT ON SCHEMA
doesn't grant rights on the tables within schema if I understand correctly.

I feel we'd better compare the syntax with the existing publication command:
FOR ALL TABLES. If you create a publication FOR ALL TABLES, it means
publishing all the tables in the database *including* tables created
in the future. I think both the syntax and meaning of ALL TABLES IN
SCHEMA are consistent with the existing FOR ALL TABLES.

IMHO, I feel closer to Robert. "ALL TABLES IN SCHEMA" sounds like the
concrete tables at the time of invocation. While I agree that it is not directly
comparable to GRANT, but if I see "ALTER PUBLICATION p1 ADD SCHEMA s1", I
automatically translate that into "all tables in the schema s1 at the time of using
this publication". At least, it would cause less confusion when it were "ALT PUB
p1 DROP SCEMA s1" aginst "DROP ALL TABLES IN SCHEMA s1".

However..

And the behavior is clearly documented, so personally I think it's fine.
https://www.postgresql.org/docs/devel/sql-createpublication.html
--
FOR ALL TABLES
Marks the publication as one that replicates changes for all tables in the

database, including tables created in the future.

FOR ALL TABLES IN SCHEMA
Marks the publication as one that replicates changes for all tables in the

specified list of schemas, including tables created in the future.

--

Besides, as mentioned(and suggested by Tom[1]), we might support
publishing SEQUENCE(or others) in the future. It would give more
flexibility to user if we have another FOR ALL SEQUENCES(or other objects) IN

SCHEMA.

[1]

/messages/by-id/155565.1628954580@sss.pgh.pa.u

s

Fair point. Should be stupid, but how about the following?

CREATE PUBLICATION p1 FOR TABLES * IN SCHEMA s1;
DROP PUBLICATION p1 FOR TABLES * IN SCHEMA s1;
ATLER PUBLICATION p1 ADD TABLES * IN SCHEMA s1; ALTER PUBLICATION
p1 DROP TABLES * IN SCHEMA s1;

This is an analog of synchronous_standby_names. But I'm not sure a bare
asterisc can appear there.. We could use ANY instead?

CREATE PUBLICATION p1 FOR TABLES ANY IN SCHEMA s1; ...

Thanks for the suggestions. But personally, I am not sure if this is better the
current syntax as it seems syntactically inconsistent with the existing "FOR
ALL TABLES". Also, the behavior to include future tables is consistent with FOR
ALL TABLES.

Best regards,
Hou zj

#17Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: vignesh C (#14)
RE: why can't a table be part of the same publication as its schema

On Monday, September 12, 2022 10:14 PM vignesh C <vignesh21@gmail.com> wrote:

On Sat, 10 Sept 2022 at 07:32, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Sep 9, 2022 at 8:48 PM Robert Haas <robertmhaas@gmail.com>

wrote:

On Fri, Sep 9, 2022 at 10:29 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

IIRC, the feature currently works almost the same as you
described. It doesn't create entry for tables that are published
via its schema level, it only record the published schema and check which

tables are part of it.

Oh, well if that's the case, that is great news.

Yes, the feature works as you and Hou-San have mentioned.

But then I don't
understand Amit's comment from before:

Yes, because otherwise, there was confusion while dropping the
objects from publication. Consider in the above case, if we would
have allowed it and then the user performs ALTER PUBLICATION p1
DROP ALL TABLES IN SCHEMA s1, then (a) shall we remove both schema
s1 and a table that is separately added (s1.t1) from that schema,
or (b) just remove schema s1?

I believe that (b) is the correct behavior, so I assumed that this
issue must be some difficulty in implementing it, like a funny
catalog representation.

No, it was because of syntax. IIRC, during development, Greg Nancarrow
raised a point [1] that a user can expect the individually added
tables for a schema which is also part of the publication to also get
dropped when she specifies DROP ALL TABLES IN SCHEMA. IIRC,
originally, the patch had a behavior (b) but then changed due to
discussion around this point. But now that it seems you and others
don't feel that was right, we can change back to (b) as I think that
shouldn't be difficult to achieve.

I have made the changes to allow creation of publication with a schema and
table of the same schema. The attached patch has the changes for the same.
I'm planning to review and test the patch further.

Thanks for the patch. While reviewing it, I found that the column list behavior
might need to be changed or confirmed after allowing the above case.

After applying the patch, we support adding a table with column list along with
the table's schema[1]- CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN SCHEMA public; -----, and it will directly apply the column list in the
logical replication after applying the patch.

[1]: - CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN SCHEMA public; -----
CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN SCHEMA public;
-----

If from the point of view of consistency, for column list, we could report an
ERROR because we currently don't allow using different column lists for a
table. Maybe an ERROR like:

"ERROR: cannot use column for table x when the table's schema is also in the publication"

But if we want to report an ERROR for column list in above case. We might need
to restrict the ALTER TABLE SET SCHEMA as well because user could move a table
which is published with column list to a schema that is also published in the
publication, so we might need to add some similar check(which is removed in
Vignesh's patch) to tablecmd.c to disallow this case.

Another option could be just ingore the column list if table's schema is also
part of publication. But it seems slightly inconsistent with the rule that we
disallow using different column list for a table.

Best regards,
Hou zj

#18Peter Eisentraut
peter_e@gmx.net
In reply to: Zhijie Hou (Fujitsu) (#17)
Re: why can't a table be part of the same publication as its schema

On 14.09.22 07:10, houzj.fnst@fujitsu.com wrote:

After applying the patch, we support adding a table with column list along with
the table's schema[1], and it will directly apply the column list in the
logical replication after applying the patch.

[1]--
CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN SCHEMA public;
-----

If from the point of view of consistency, for column list, we could report an
ERROR because we currently don't allow using different column lists for a
table. Maybe an ERROR like:

"ERROR: cannot use column for table x when the table's schema is also in the publication"

But if we want to report an ERROR for column list in above case. We might need
to restrict the ALTER TABLE SET SCHEMA as well because user could move a table
which is published with column list to a schema that is also published in the
publication, so we might need to add some similar check(which is removed in
Vignesh's patch) to tablecmd.c to disallow this case.

Another option could be just ingore the column list if table's schema is also
part of publication. But it seems slightly inconsistent with the rule that we
disallow using different column list for a table.

Ignoring things doesn't seem like a good idea.

A solution might be to disallow adding any schemas to a publication if
column lists on a table are specified.

#19Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Peter Eisentraut (#18)
RE: why can't a table be part of the same publication as its schema

On Thursday, September 15, 2022 3:37 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:

Hi,

On 14.09.22 07:10, houzj.fnst@fujitsu.com wrote:

After applying the patch, we support adding a table with column list
along with the table's schema[1], and it will directly apply the
column list in the logical replication after applying the patch.

[1]--
CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN
SCHEMA public;
-----

If from the point of view of consistency, for column list, we could
report an ERROR because we currently don't allow using different
column lists for a table. Maybe an ERROR like:

"ERROR: cannot use column for table x when the table's schema is also in the

publication"

But if we want to report an ERROR for column list in above case. We
might need to restrict the ALTER TABLE SET SCHEMA as well because user
could move a table which is published with column list to a schema
that is also published in the publication, so we might need to add
some similar check(which is removed in Vignesh's patch) to tablecmd.c to

disallow this case.

Another option could be just ingore the column list if table's schema
is also part of publication. But it seems slightly inconsistent with
the rule that we disallow using different column list for a table.

Ignoring things doesn't seem like a good idea.

A solution might be to disallow adding any schemas to a publication if column
lists on a table are specified.

Thanks for the suggestion. If I understand correctly, you mean we can disallow
publishing a table with column list and any schema(a schema that the table
might not be part of) in the same publication[1]CREATE PUBLICATION pub FOR TABLE public.test(a), ALL TABLES IN SCHEMA s2; ERROR: "cannot add schema to publication when column list is used in the published table" --.

something like--
[1]: CREATE PUBLICATION pub FOR TABLE public.test(a), ALL TABLES IN SCHEMA s2; ERROR: "cannot add schema to publication when column list is used in the published table" --
ERROR: "cannot add schema to publication when column list is used in the published table"
--

Personally, it looks acceptable to me as user can anyway achieve the same
purpose by creating serval publications and combine it and we can save the
restriction at ALTER TABLE SET SCHEMA. Although it restricts some cases.
I will post a top-up patch about this soon.

About the row filter handling, maybe we don't need to restrict row filter like
above ? Because the rule is to simply merge the row filter with 'OR' among
publications, so it seems we could ignore the row filter in the publication when
the table's schema is also published in the same publication(which means no filter).

Best regards,
Hou zj

#20Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Zhijie Hou (Fujitsu) (#19)
RE: why can't a table be part of the same publication as its schema

On Thursday, September 15, 2022 10:48 AM houzj.fnst@fujitsu.com wrote:

On Thursday, September 15, 2022 3:37 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

Hi,

On 14.09.22 07:10, houzj.fnst@fujitsu.com wrote:

After applying the patch, we support adding a table with column list
along with the table's schema[1], and it will directly apply the
column list in the logical replication after applying the patch.

[1]--
CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN
SCHEMA public;
-----

If from the point of view of consistency, for column list, we could
report an ERROR because we currently don't allow using different
column lists for a table. Maybe an ERROR like:

"ERROR: cannot use column for table x when the table's schema is
also in the

publication"

But if we want to report an ERROR for column list in above case. We
might need to restrict the ALTER TABLE SET SCHEMA as well because
user could move a table which is published with column list to a
schema that is also published in the publication, so we might need
to add some similar check(which is removed in Vignesh's patch) to
tablecmd.c to

disallow this case.

Another option could be just ingore the column list if table's
schema is also part of publication. But it seems slightly
inconsistent with the rule that we disallow using different column list for a

table.

Ignoring things doesn't seem like a good idea.

A solution might be to disallow adding any schemas to a publication if
column lists on a table are specified.

Thanks for the suggestion. If I understand correctly, you mean we can disallow
publishing a table with column list and any schema(a schema that the table
might not be part of) in the same publication[1].

something like--
[1]CREATE PUBLICATION pub FOR TABLE public.test(a), ALL TABLES IN SCHEMA
s2;
ERROR: "cannot add schema to publication when column list is used in the
published table"
--

Personally, it looks acceptable to me as user can anyway achieve the same
purpose by creating serval publications and combine it and we can save the
restriction at ALTER TABLE SET SCHEMA. Although it restricts some cases.
I will post a top-up patch about this soon.

About the row filter handling, maybe we don't need to restrict row filter like
above ? Because the rule is to simply merge the row filter with 'OR' among
publications, so it seems we could ignore the row filter in the publication when
the table's schema is also published in the same publication(which means no
filter).

Attach the new version patch which added suggested restriction for column list
and merged Vignesh's patch.

Some other document might need to be updated. I will update them soon.

Best regards,
Hou zj

Attachments:

v3-0001-Allow-creation-of-publication-with-schema-and-tab.patchapplication/octet-stream; name=v3-0001-Allow-creation-of-publication-with-schema-and-tab.patchDownload+245-169
#21Peter Smith
smithpb2250@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#20)
#22Amit Kapila
amit.kapila16@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#19)
#23Amit Kapila
amit.kapila16@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#20)
#24Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Amit Kapila (#23)
#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#24)
#26Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Amit Kapila (#25)
#27Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Zhijie Hou (Fujitsu) (#26)
#28Jonathan S. Katz
jkatz@postgresql.org
In reply to: Alvaro Herrera (#27)
#29Amit Kapila
amit.kapila16@gmail.com
In reply to: Alvaro Herrera (#27)
#30Jonathan S. Katz
jkatz@postgresql.org
In reply to: Jonathan S. Katz (#28)
#31Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Kapila (#29)
#32Amit Kapila
amit.kapila16@gmail.com
In reply to: Alvaro Herrera (#31)
#33Robert Haas
robertmhaas@gmail.com
In reply to: Jonathan S. Katz (#30)
#34Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Kyotaro Horiguchi (#15)
#35Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Jonathan S. Katz (#30)
#36Jonathan S. Katz
jkatz@postgresql.org
In reply to: Mark Dilger (#35)
#37Jonathan S. Katz
jkatz@postgresql.org
In reply to: Robert Haas (#33)
#38Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Jonathan S. Katz (#36)
#39Jonathan S. Katz
jkatz@postgresql.org
In reply to: Mark Dilger (#38)
#40Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Mark Dilger (#38)
#41Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#33)
#42Jonathan S. Katz
jkatz@postgresql.org
In reply to: Alvaro Herrera (#41)
#43Amit Kapila
amit.kapila16@gmail.com
In reply to: Jonathan S. Katz (#42)
#44Amit Kapila
amit.kapila16@gmail.com
In reply to: Jonathan S. Katz (#37)
#45Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#41)
#46Jonathan S. Katz
jkatz@postgresql.org
In reply to: Alvaro Herrera (#45)