adding partitioned tables to publications

Started by Amit Langoteover 6 years ago81 messageshackers

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

Thanks,
Amit

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Amit Langote (#1)

Re: adding partitioned tables to publications

On Mon, Oct 7, 2019 at 9:55 AM Amit Langote <amitlangote09@gmail.com> wrote:

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

Added this to the next CF: https://commitfest.postgresql.org/25/2301/

Thanks,
Amit

rafia.pghackers@gmail.com

over 6 years ago

In reply to: Amit Langote (#2)

Re: adding partitioned tables to publications

On Thu, 10 Oct 2019 at 08:29, Amit Langote <amitlangote09@gmail.com> wrote:

On Mon, Oct 7, 2019 at 9:55 AM Amit Langote <amitlangote09@gmail.com>
wrote:

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

Added this to the next CF: https://commitfest.postgresql.org/25/2301/

Hi Amit,

Lately I was exploring logical replication feature of postgresql and I
found this addition in the scope of feature for partitioned tables a useful
one.

In order to understand the working of your patch a bit more, I performed an
experiment wherein I created a partitioned table with several children and
a default partition at the publisher side and normal tables of the same
name as parent, children, and default partition of the publisher side at
the subscriber side. Next I established the logical replication connection
and to my surprise the data was successfully replicated from partitioned
tables to normal tables and then this error filled the logs,
LOG: logical replication table synchronization worker for subscription
"my_subscription", table "parent" has started
ERROR: table "public.parent" not found on publisher

here parent is the name of the partitioned table at the publisher side and
it is present as normal table at subscriber side as well. Which is
understandable, it is trying to find a normal table of the same name but
couldn't find one, maybe it should not worry about that now also if not at
replication time.

Please let me know if this is something expected because in my opinion this
is not desirable, there should be some check to check the table type for
replication. This wasn't important till now maybe because only normal
tables were to be replicated, but with the extension of the scope of
logical replication to more objects such checks would be helpful.

On a separate note was thinking for partitioned tables, wouldn't it be
cleaner to have something like you create only partition table at the
subscriber and then when logical replication starts it creates the child
tables accordingly. Or would that be too much in future...?

--
Regards,
Rafia Sabih

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Rafia Sabih (#3)

Re: adding partitioned tables to publications

Hello Rafia,

Great to hear that you are interested in this feature and thanks for
testing the patch.

On Thu, Oct 10, 2019 at 10:13 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:

Lately I was exploring logical replication feature of postgresql and I found this addition in the scope of feature for partitioned tables a useful one.

In order to understand the working of your patch a bit more, I performed an experiment wherein I created a partitioned table with several children and a default partition at the publisher side and normal tables of the same name as parent, children, and default partition of the publisher side at the subscriber side. Next I established the logical replication connection and to my surprise the data was successfully replicated from partitioned tables to normal tables and then this error filled the logs,
LOG: logical replication table synchronization worker for subscription "my_subscription", table "parent" has started
ERROR: table "public.parent" not found on publisher

here parent is the name of the partitioned table at the publisher side and it is present as normal table at subscriber side as well. Which is understandable, it is trying to find a normal table of the same name but couldn't find one, maybe it should not worry about that now also if not at replication time.

Please let me know if this is something expected because in my opinion this is not desirable, there should be some check to check the table type for replication. This wasn't important till now maybe because only normal tables were to be replicated, but with the extension of the scope of logical replication to more objects such checks would be helpful.

Thanks for sharing this case. I hadn't considered it, but you're
right that it should be handled sensibly. I have fixed table sync
code to handle this case properly. Could you please check your case
with the attached updated patch?

On a separate note was thinking for partitioned tables, wouldn't it be cleaner to have something like you create only partition table at the subscriber and then when logical replication starts it creates the child tables accordingly. Or would that be too much in future...?

Hmm, we'd first need to built the "automatic partition creation"
feature to consider doing something like that. I'm sure you'd agree
that we should undertake that project separately from this tiny
logical replication usability improvement project. :)

Thanks again.

Regards,
Amit

petr@2ndquadrant.com

over 6 years ago

In reply to: Amit Langote (#1)

Re: adding partitioned tables to publications

Hi,

On 07/10/2019 02:55, Amit Langote wrote:

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

Or just create publication publish_p for table p1, p2, p3;

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

+1

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

Yeah for that to work subscription would need to also need to be able to
write to partitioned tables, so it needs both sides to add support for
this. I think if we do both what you did and the transparent handling of
root only, we'll need new keyword to differentiate the two. It might
make sense to think about if we want your way to need an extra keyword
or the transparent one will need it.

One issue that I see reading the patch is following set of commands:

CREATE TABLE foo ...;
CREATE PUBLICATION mypub FOR TABLE foo;

CREATE TABLE bar ...;
ALTER PUBLICATION mypub ADD TABLE bar;

ALTER TABLE foo ATTACH PARTITION bar ...;
ALTER TABLE foo DETACH PARTITION bar ...;

This will end up with bar not being in any publication even though it
was explicitly added. That might be acceptable caveat but it at least
should be clearly documented (IMHO with warning).

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

david@fetter.org

over 6 years ago

In reply to: Amit Langote (#1)

Re: adding partitioned tables to publications

On Mon, Oct 07, 2019 at 09:55:23AM +0900, Amit Langote wrote:

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

With this patch, is it possible to remove a partition manually from a
subscription, or will it just get automatically re-added at some
point?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: David Fetter (#6)

Re: adding partitioned tables to publications

Hi David,

On Sun, Oct 13, 2019 at 4:55 PM David Fetter <david@fetter.org> wrote:

On Mon, Oct 07, 2019 at 09:55:23AM +0900, Amit Langote wrote:

I propose that we make this command:

create publication publish_p for table p;

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

With this patch, is it possible to remove a partition manually from a
subscription, or will it just get automatically re-added at some
point?

Hmm, I don't think there is any way (commands) to manually remove
tables from a subscription. Testing shows that if you drop a table on
the subscription server that is currently being fed data via a
subscription, then a subscription worker will complain and quit if it
receives a row targeting the dropped table and workers that are
subsequently started will do the same thing. Interestingly, this
behavior prevents replication for any other tables in the subscription
from proceeding, which seems unfortunate.

If you were asking if the patch extends the subscription side
functionality to re-add needed partitions that were manually removed
likely by accident, then no.

Thanks,
Amit

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Petr Jelinek (#5)

Re: adding partitioned tables to publications

Hi Petr,

Thanks for your comments.

On Sun, Oct 13, 2019 at 5:01 AM Petr Jelinek <petr@2ndquadrant.com> wrote:

On 07/10/2019 02:55, Amit Langote wrote:

One cannot currently add partitioned tables to a publication.

create table p (a int, b int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

create publication publish_p for table p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.

One can do this instead:

create publication publish_p1 for table p1;
create publication publish_p2 for table p2;
create publication publish_p3 for table p3;

Or just create publication publish_p for table p1, p2, p3;

Yep, facepalm! :)

So, one doesn't really need as many publication objects as there are
partitions as my version suggests, which is good. Although, as you
can tell, a user would still manually need to keep the set of
published partitions up to date, for example when new partitions are
added.

but maybe that's too much code to maintain for users.

I propose that we make this command:

create publication publish_p for table p;

+1

automatically add all the partitions to the publication. Also, any
future partitions should also be automatically added to the
publication. So, publishing a partitioned table automatically
publishes all of its existing and future partitions. Attached patch
implements that.

What doesn't change with this patch is that the partitions on the
subscription side still have to match one-to-one with the partitions
on the publication side, because the changes are still replicated as
being made to the individual partitions, not as the changes to the
root partitioned table. It might be useful to implement that
functionality on the publication side, because it allows users to
define the replication target any way they need to, but this patch
doesn't implement that.

Yeah for that to work subscription would need to also need to be able to
write to partitioned tables, so it needs both sides to add support for
this.

Ah, I didn't know that the subscription code doesn't out-of-the-box
support tuple routing. Indeed, we will need to fix that.

I think if we do both what you did and the transparent handling of
root only, we'll need new keyword to differentiate the two. It might
make sense to think about if we want your way to need an extra keyword
or the transparent one will need it.

I didn't think about that but maybe you are right.

One issue that I see reading the patch is following set of commands:

CREATE TABLE foo ...;
CREATE PUBLICATION mypub FOR TABLE foo;

CREATE TABLE bar ...;
ALTER PUBLICATION mypub ADD TABLE bar;

ALTER TABLE foo ATTACH PARTITION bar ...;
ALTER TABLE foo DETACH PARTITION bar ...;

This will end up with bar not being in any publication even though it
was explicitly added.

I tested and bar continues to be in the publication with above steps:

create table foo (a int) partition by list (a);
create publication mypub for table foo;
create table bar (a int);
alter publication mypub add table bar;
\d bar
Table "public.bar"
Column │ Type │ Collation │ Nullable │ Default
────────┼─────────┼───────────┼──────────┼─────────
a │ integer │ │ │
Publications:
"mypub"

alter table foo attach partition bar for values in (1);
\d bar
Table "public.bar"
Column │ Type │ Collation │ Nullable │ Default
────────┼─────────┼───────────┼──────────┼─────────
a │ integer │ │ │
Partition of: foo FOR VALUES IN (1)
Publications:
"mypub"

-- can't now drop bar from mypub (its membership is no longer standalone)
alter publication mypub drop table bar;
ERROR: cannot drop partition "bar" from an inherited publication
HINT: Drop the parent from publication instead.

alter table foo detach partition bar;

-- bar is still in mypub (now a standalone member)
\d bar
Table "public.bar"
Column │ Type │ Collation │ Nullable │ Default
────────┼─────────┼───────────┼──────────┼─────────
a │ integer │ │ │
Publications:
"mypub"

-- ok to drop now from mypub
alter publication mypub drop table bar;

Thanks,
Amit

Peter Eisentraut

peter_e@gmx.net

over 6 years ago

In reply to: Amit Langote (#4)

Re: adding partitioned tables to publications

This patch seems excessively complicated to me. Why don't you just add
the actual partitioned table to pg_publication_rel and then expand the
partition hierarchy in pgoutput (get_rel_sync_entry() or
GetRelationPublications() or somewhere around there). Then you don't
need to do any work in table DDL to keep the list of published tables up
to date.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

rafia.pghackers@gmail.com

over 6 years ago

In reply to: Amit Langote (#4)

Re: adding partitioned tables to publications

Hi Amit,

On Fri, 11 Oct 2019 at 08:06, Amit Langote <amitlangote09@gmail.com> wrote:

Thanks for sharing this case. I hadn't considered it, but you're
right that it should be handled sensibly. I have fixed table sync
code to handle this case properly. Could you please check your case
with the attached updated patch?

I was checking this today and found that the behavior doesn't change much

with the updated patch. The tables are still replicated, just that a select
count from parent table shows 0, rest of the partitions including default
one has the data from the publisher. I was expecting more like an error at
subscriber saying the table type is not same.

Please find the attached file for the test case, in case something is
unclear.

--
Regards,
Rafia Sabih

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Peter Eisentraut (#9)

Re: adding partitioned tables to publications

Sorry about the delay.

On Mon, Nov 4, 2019 at 8:00 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

This patch seems excessively complicated to me. Why don't you just add
the actual partitioned table to pg_publication_rel and then expand the
partition hierarchy in pgoutput (get_rel_sync_entry() or
GetRelationPublications() or somewhere around there). Then you don't
need to do any work in table DDL to keep the list of published tables up
to date.

I tend to agree that having to manage this at the DDL level would be
bug-prone, not to mention pretty complicated code to implement it.

I have tried to implement it the way you suggested. So every decoded
change to a leaf partition will now be published not only via its own
publication but also via publications of its ancestors if any. That
irrespective of whether a row is directly inserted into the leaf
partition or got routed to it via insert done on an ancestor. In this
implementation, the only pg_publication_rel entry is the one
corresponding to the partitioned table.

On the subscription side, when creating pg_subscription_rel entries,
for a publication containing a partitioned table, all of its
partitions too must be fetched as being included in the publication.
That is necessary, because the initial syncing copy and subsequently
received changes must be applied to individual partitions. That could
be changed in the future by publishing leaf partition changes as
changes to the actually published partitioned table. That future
implementation will also hopefully take care of the concern that Rafia
mentioned on this thread that even with this patch, one must make sure
that tables match one-to-one when they're in publish-subscribe
relationship, which actually needs us to bake in low-level details
like table's relkind in the protocol exchanges.

Anyway, I've attached two patches -- 0001 is a refactoring patch. 0002
implements the feature.

Thanks,
Amit

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Rafia Sabih (#10)

Re: adding partitioned tables to publications

Hello Rafia,

On Tue, Nov 5, 2019 at 12:41 AM Rafia Sabih <rafia.pghackers@gmail.com> wrote:

On Fri, 11 Oct 2019 at 08:06, Amit Langote <amitlangote09@gmail.com> wrote:

Thanks for sharing this case. I hadn't considered it, but you're
right that it should be handled sensibly. I have fixed table sync
code to handle this case properly. Could you please check your case
with the attached updated patch?

I was checking this today and found that the behavior doesn't change much with the updated patch. The tables are still replicated, just that a select count from parent table shows 0, rest of the partitions including default one has the data from the publisher. I was expecting more like an error at subscriber saying the table type is not same.

Please find the attached file for the test case, in case something is unclear.

Thanks for the test case.

With the latest patch I posted, you'll get the following error on subscriber:

create subscription mysub connection 'host=localhost port=5432
dbname=postgres' publication mypub;
ERROR: cannot use relation "public.t" as logical replication target
DETAIL: "public.t" is a regular table on subscription side whereas a
partitioned table on publication side

Although to be honest, I'd rather not see the error. As I mentioned
in my email earlier, it'd be nice to be able sync a partitioned table
and a regular table (or vice versa) via replication.

Thanks,
Amit

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Amit Langote (#11)

Re: adding partitioned tables to publications

On Fri, Nov 8, 2019 at 1:27 PM Amit Langote <amitlangote09@gmail.com> wrote:

Anyway, I've attached two patches -- 0001 is a refactoring patch. 0002
implements the feature.

0002 didn't contain necessary pg_dump changes, which fixed in the
attached new version.

Thanks,
Amit

Peter Eisentraut

peter_e@gmx.net

over 6 years ago

In reply to: Amit Langote (#13)

Re: adding partitioned tables to publications

On 2019-11-11 08:59, Amit Langote wrote:

On Fri, Nov 8, 2019 at 1:27 PM Amit Langote <amitlangote09@gmail.com> wrote:

Anyway, I've attached two patches -- 0001 is a refactoring patch. 0002
implements the feature.

0002 didn't contain necessary pg_dump changes, which fixed in the
attached new version.

That looks more pleasant.

I don't understand why you go through great lengths to ensure that the
relkinds match between publisher and subscriber. We already ensure that
only regular tables are published and only regular tables are allowed as
subscription target. In the future, we may want to allow further
combinations. What situation are you trying to address here?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Peter Eisentraut (#14)

Re: adding partitioned tables to publications

On Mon, Nov 11, 2019 at 9:49 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2019-11-11 08:59, Amit Langote wrote:

On Fri, Nov 8, 2019 at 1:27 PM Amit Langote <amitlangote09@gmail.com> wrote:

Anyway, I've attached two patches -- 0001 is a refactoring patch. 0002
implements the feature.

0002 didn't contain necessary pg_dump changes, which fixed in the
attached new version.

That looks more pleasant.

Thanks for looking.

I don't understand why you go through great lengths to ensure that the
relkinds match between publisher and subscriber. We already ensure that
only regular tables are published and only regular tables are allowed as
subscription target. In the future, we may want to allow further
combinations. What situation are you trying to address here?

I'd really want to see the requirement for relkinds to have to match
go away, but as you can see, this patch doesn't modify enough of
pgoutput.c and worker.c to make that possible. Both the code for the
initital syncing and that for the subsequent real-time replication
assume that both source and target are regular tables. So even if
partitioned tables can now be in a publication, they're never sent in
the protocol messages, only their leaf partitions are. Initial
syncing code can be easily modified to support any combination of
source and target relations, but changes needed for real-time
replication seem non-trivial. Do you think we should do that before
we can say partitioned tables support logical replication?

Thanks,
Amit

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Amit Langote (#15)

Re: adding partitioned tables to publications

On Tue, Nov 12, 2019 at 10:11 AM Amit Langote <amitlangote09@gmail.com> wrote:

Initial
syncing code can be easily modified to support any combination of
source and target relations, but changes needed for real-time
replication seem non-trivial.

I have spent some time hacking on this. With the attached updated
patch, adding a partitioned table to publication results in publishing
the inserts, updates, deletes of the table's leaf partitions as
inserts, updates, deletes of the table itself (it all happens inside
pgoutput). So, the replication target table doesn't necessarily have
to be a partitioned table and even if it is partitioned its partitions
don't have to match one-to-one.

One restriction remains though: partitioned tables on a subscriber
can't accept updates and deletes, because we'd need to map those to
updates and deletes of their partitions, including handling a tuple
possibly moving from one partition to another during an update.

Also, I haven't added subscription tests yet.

Attached updated patch. The previous division into a refactoring
patch and feature patch no longer made to sense to me, so there is
only one this time.

Thanks,
Amit

Peter Eisentraut

peter_e@gmx.net

over 6 years ago

In reply to: Amit Langote (#16)

Re: adding partitioned tables to publications

On 2019-11-18 09:53, Amit Langote wrote:

I have spent some time hacking on this. With the attached updated
patch, adding a partitioned table to publication results in publishing
the inserts, updates, deletes of the table's leaf partitions as
inserts, updates, deletes of the table itself (it all happens inside
pgoutput). So, the replication target table doesn't necessarily have
to be a partitioned table and even if it is partitioned its partitions
don't have to match one-to-one.

One restriction remains though: partitioned tables on a subscriber
can't accept updates and deletes, because we'd need to map those to
updates and deletes of their partitions, including handling a tuple
possibly moving from one partition to another during an update.

Right. Without that second part, the first part isn't really that
useful yet, is it?

I'm not sure what your intent with this patch is now. I thought the
previous behavior -- add a partitioned table to a publication and its
leaf tables appear in the replication output -- was pretty welcome. Do
we not want that anymore?

There should probably be an option to pick the behavior, like we do in
pg_dump.

What happens when you add a leaf table directly to a publication? Is it
replicated under its own identity or under its ancestor partitioned
table? (What if both the leaf table and a partitioned table are
publication members?)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Peter Eisentraut

peter_e@gmx.net

over 6 years ago

In reply to: Amit Langote (#15)

Re: adding partitioned tables to publications

On 2019-11-12 02:11, Amit Langote wrote:

I don't understand why you go through great lengths to ensure that the
relkinds match between publisher and subscriber. We already ensure that
only regular tables are published and only regular tables are allowed as
subscription target. In the future, we may want to allow further
combinations. What situation are you trying to address here?

I'd really want to see the requirement for relkinds to have to match
go away, but as you can see, this patch doesn't modify enough of
pgoutput.c and worker.c to make that possible. Both the code for the
initital syncing and that for the subsequent real-time replication
assume that both source and target are regular tables. So even if
partitioned tables can now be in a publication, they're never sent in
the protocol messages, only their leaf partitions are. Initial
syncing code can be easily modified to support any combination of
source and target relations, but changes needed for real-time
replication seem non-trivial. Do you think we should do that before
we can say partitioned tables support logical replication?

My question was more simply why you have this check:

+   /*
+    * Cannot replicate from a regular to a partitioned table or vice
+    * versa.
+    */
+   if (local_relkind != pt->relkind)
+       ereport(ERROR,
+               (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+                errmsg("cannot use relation \"%s.%s\" as logical 
replication target",
+                       rv->schemaname, rv->relname),

It doesn't seem necessary. What happens if you remove it?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Peter Eisentraut (#17)

Re: adding partitioned tables to publications

Hi Peter,

On Wed, Nov 20, 2019 at 4:55 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2019-11-18 09:53, Amit Langote wrote:

I have spent some time hacking on this. With the attached updated
patch, adding a partitioned table to publication results in publishing
the inserts, updates, deletes of the table's leaf partitions as
inserts, updates, deletes of the table itself (it all happens inside
pgoutput). So, the replication target table doesn't necessarily have
to be a partitioned table and even if it is partitioned its partitions
don't have to match one-to-one.

One restriction remains though: partitioned tables on a subscriber
can't accept updates and deletes, because we'd need to map those to
updates and deletes of their partitions, including handling a tuple
possibly moving from one partition to another during an update.

Right. Without that second part, the first part isn't really that
useful yet, is it?

I would say yes.

I'm not sure what your intent with this patch is now. I thought the
previous behavior -- add a partitioned table to a publication and its
leaf tables appear in the replication output -- was pretty welcome. Do
we not want that anymore?

Hmm, I thought it would be more desirable to not expose a published
partitioned table's leaf partitions to a subscriber, because it allows
the target table to be defined more flexibly.

There should probably be an option to pick the behavior, like we do in
pg_dump.

I don't understand which existing behavior. Can you clarify?

Regarding allowing users to choose between publishing partitioned
table changes using leaf tables' schema vs as using own schema, I tend
to agree that there would be value in that. Users who choose the
former will have to ensure that target leaf partitions match exactly.
Users who want flexibility in how the target table is defined can use
the latter.

What happens when you add a leaf table directly to a publication? Is it
replicated under its own identity or under its ancestor partitioned
table? (What if both the leaf table and a partitioned table are
publication members?)

If both a leaf partition and an ancestor belong to the same
publication, then leaf partition changes are replicated using the
ancestor's schema. For a leaf partition to be replicated using its
own schema it must be published via a separate publication that
doesn't contain the ancestor. At least that's what the current patch
does.

Thanks,
Amit

Peter Eisentraut

peter_e@gmx.net

over 6 years ago

In reply to: Amit Langote (#19)

Re: adding partitioned tables to publications

On 2019-11-22 07:28, Amit Langote wrote:

Hmm, I thought it would be more desirable to not expose a published
partitioned table's leaf partitions to a subscriber, because it allows
the target table to be defined more flexibly.

There are multiple different variants that we probably eventually want
to support. But I think there is value in exposing the partition
structure to the subscriber. Most notably, it allows the subscriber to
run the initial table sync per partition rather than in one big chunk --
which ultimately reflects one of the reasons partitioning exists.

The other way, exposing only the partitioned table, is also useful,
especially if you want to partition differently on the subscriber. But
without the ability to target a partitioned table on the subscriber,
this would right now only allow you to replicate a partitioned table
into a non-partitioned table. Which is valid but probably not often useful.

What happens when you add a leaf table directly to a publication? Is it
replicated under its own identity or under its ancestor partitioned
table? (What if both the leaf table and a partitioned table are
publication members?)

If both a leaf partition and an ancestor belong to the same
publication, then leaf partition changes are replicated using the
ancestor's schema. For a leaf partition to be replicated using its
own schema it must be published via a separate publication that
doesn't contain the ancestor. At least that's what the current patch
does.

Hmm, that seems confusing. This would mean that if you add a
partitioned table to a publication that already contains leaf tables,
the publication behavior of the leaf tables would change. So again, I
think this alternative behavior of publishing partitions under the name
of their root table should be an explicit option on a publication, and
then it should be ensured somehow that individual partitions are not
added to the publication in confusing ways.

So, it's up to you which aspect of this you want to tackle, but I
thought your original goal of being able to add partitioned tables to
publications and have that implicitly expand to all member partitions on
the publication side seemed quite useful, self-contained, and
uncontroversial.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

amit.kapila16@gmail.com

over 6 years ago

In reply to: Peter Eisentraut (#20)

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Peter Eisentraut (#20)

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Amit Langote (#22)

Peter Eisentraut

peter_e@gmx.net

over 6 years ago

In reply to: Amit Langote (#23)

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Peter Eisentraut (#24)

dilipbalaut@gmail.com

over 6 years ago

In reply to: Amit Langote (#25)

rafia.pghackers@gmail.com

over 6 years ago

In reply to: Amit Langote (#25)

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Rafia Sabih (#27)

rafia.pghackers@gmail.com

over 6 years ago

In reply to: Amit Langote (#28)

Peter Eisentraut

peter_e@gmx.net

over 6 years ago

In reply to: Amit Langote (#28)

Peter Eisentraut

peter_e@gmx.net

over 6 years ago

In reply to: Rafia Sabih (#29)

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Peter Eisentraut (#31)

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Peter Eisentraut (#30)

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Amit Langote (#33)

Peter Eisentraut

peter_e@gmx.net

over 6 years ago

In reply to: Amit Langote (#34)

rafia.pghackers@gmail.com

over 6 years ago

In reply to: Amit Langote (#34)

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Peter Eisentraut (#35)

Langote_Amit_f8@lab.ntt.co.jp

over 6 years ago

In reply to: Rafia Sabih (#36)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#38)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#39)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#40)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#41)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Amit Langote (#42)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#42)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#44)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#45)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#43)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#47)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#48)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#49)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Amit Langote (#50)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Amit Langote (#51)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#52)

petr@2ndquadrant.com

about 6 years ago

In reply to: Peter Eisentraut (#53)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Petr Jelinek (#54)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Amit Langote (#55)

petr@2ndquadrant.com

about 6 years ago

In reply to: Amit Langote (#56)

tgl@sss.pgh.pa.us

about 6 years ago

In reply to: Petr Jelinek (#57)

petr@2ndquadrant.com

about 6 years ago

In reply to: Tom Lane (#58)

tgl@sss.pgh.pa.us

about 6 years ago

In reply to: Petr Jelinek (#59)

petr@2ndquadrant.com

about 6 years ago

In reply to: Tom Lane (#60)

tgl@sss.pgh.pa.us

about 6 years ago

In reply to: Petr Jelinek (#61)

petr@2ndquadrant.com

about 6 years ago

In reply to: Tom Lane (#62)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Petr Jelinek (#63)

tgl@sss.pgh.pa.us

about 6 years ago

In reply to: Amit Langote (#64)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#56)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#66)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Amit Langote (#67)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#68)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#69)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Amit Langote (#70)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#71)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#72)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#73)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#74)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Amit Langote (#75)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Amit Langote (#75)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#77)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#78)

Peter Eisentraut

peter_e@gmx.net

about 6 years ago

In reply to: Amit Langote (#79)

Langote_Amit_f8@lab.ntt.co.jp

about 6 years ago

In reply to: Peter Eisentraut (#80)