Skipping logical replication transactions on subscriber side
Hi all,
If a logical replication worker cannot apply the change on the
subscriber for some reason (e.g., missing table or violating a
constraint, etc.), logical replication stops until the problem is
resolved. Ideally, we resolve the problem on the subscriber (e.g., by
creating the missing table or removing the conflicting data, etc.) but
occasionally a problem cannot be fixed and it may be necessary to skip
the entire transaction in question. Currently, we have two ways to
skip transactions: advancing the LSN of the replication origin on the
subscriber and advancing the LSN of the replication slot on the
publisher. But both ways might not be able to skip exactly one
transaction in question and end up skipping other transactions too.
I’d like to propose a way to skip the particular transaction on the
subscriber side. As the first step, a transaction can be specified to
be skipped by specifying remote XID on the subscriber. This feature
would need two sub-features: (1) a sub-feature for users to identify
the problem subscription and the problem transaction’s XID, and (2) a
sub-feature to skip the particular transaction to apply.
For (1), I think the simplest way would be to put the details of the
change being applied in errcontext. For example, the following
errcontext shows the remote XID as well as the action name, the
relation name, and commit timestamp:
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (c)=(1) already exists.
CONTEXT: during apply of "INSERT" for relation "public.test" in
transaction with xid 590 commit timestamp 2021-05-21
14:32:02.134273+09
The user can identify which remote XID has a problem during applying
the change (XID=590 in this case). As another idea, we can have a
statistics view for logical replication workers, showing information
of the last failure transaction.
For (2), what I'm thinking is to add a new action to ALTER
SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
TRANSACTION 590. Also, we can have actions to reset it; ALTER
SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
XID to a new column of pg_subscription or a new catalog, having the
worker reread its subscription information. Once the worker skipped
the specified transaction, it resets the transaction to skip on the
catalog. The syntax allows users to specify one remote XID to skip. In
the future, it might be good if users can also specify multiple XIDs
(a range of XIDs or a list of XIDs, etc).
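To make the intended workflow concrete, here is a sketch using the
proposed syntax (which of course does not exist yet; XID 590 is taken
from the errcontext example above):

```sql
-- Hypothetical commands under this proposal.
-- After the apply worker reports a failure for remote xid 590:
ALTER SUBSCRIPTION test_sub SET SKIP TRANSACTION 590;
-- Or, to cancel the skip before it takes effect:
ALTER SUBSCRIPTION test_sub RESET SKIP TRANSACTION;
```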
Feedback and comments are very welcome.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
If a logical replication worker cannot apply the change on the
subscriber for some reason (e.g., missing table or violating a
constraint, etc.), logical replication stops until the problem is
resolved. Ideally, we resolve the problem on the subscriber (e.g., by
creating the missing table or removing the conflicting data, etc.) but
occasionally a problem cannot be fixed and it may be necessary to skip
the entire transaction in question. Currently, we have two ways to
skip transactions: advancing the LSN of the replication origin on the
subscriber and advancing the LSN of the replication slot on the
publisher. But both ways might not be able to skip exactly one
transaction in question and end up skipping other transactions too.
I’d like to propose a way to skip the particular transaction on the
subscriber side. As the first step, a transaction can be specified to
be skipped by specifying remote XID on the subscriber. This feature
would need two sub-features: (1) a sub-feature for users to identify
the problem subscription and the problem transaction’s XID, and (2) a
sub-feature to skip the particular transaction to apply.
For (1), I think the simplest way would be to put the details of the
change being applied in errcontext. For example, the following
errcontext shows the remote XID as well as the action name, the
relation name, and commit timestamp:
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (c)=(1) already exists.
CONTEXT: during apply of "INSERT" for relation "public.test" in
transaction with xid 590 commit timestamp 2021-05-21
14:32:02.134273+09
In the above, the subscription name/id is not mentioned. I think you
need it for sub-feature-2.
The user can identify which remote XID has a problem during applying
the change (XID=590 in this case). As another idea, we can have a
statistics view for logical replication workers, showing information
of the last failure transaction.
It might be good to display at both places. Having subscriber-side
information in the view might be helpful in other ways as well like we
can use it to display the number of transactions processed by a
particular subscriber.
I think you need to consider a few more things here:
(a) Say the error occurs after applying some part of changes, then
just skipping the remaining part won't be sufficient, we probably need
to someway rollback the applied changes (by rolling back the
transaction or in some other way).
(b) How do you handle streamed transactions? It is possible that some
of the streams are successful and the error occurs after that, say
when writing to the stream file. Now, would you skip writing to stream
file or will you write it, and then during apply, you will skip the
entire transaction and remove the corresponding stream file.
(c) There is also a possibility that the error occurs while applying
the changes of some subtransaction (this is only possible for
streaming xacts), so, in such cases, do we allow users to rollback the
subtransaction or user has to rollback the entire transaction. I am
not sure but maybe for very large transactions users might just want
to rollback the subtransaction.
(d) How about prepared transactions? Do we need to rollback the
prepared transaction if user decides to skip such a transaction? We
already allow prepared transactions to be streamed to plugins and the
work for subscriber-side apply is in progress [1], so I think we need
to consider this case as well.
(e) Do we want to provide such a feature via output plugins as well,
if not, why?
For (2), what I'm thinking is to add a new action to ALTER
SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
TRANSACTION 590. Also, we can have actions to reset it; ALTER
SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
XID to a new column of pg_subscription or a new catalog, having the
worker reread its subscription information. Once the worker skipped
the specified transaction, it resets the transaction to skip on the
catalog.
What if we fail while updating the reset information in the catalog?
Will it be the responsibility of the user to reset such a transaction
or we will retry it after restart of worker? Now, say, we give such a
responsibility to the user and the user forgets to reset it then there
is a possibility that after wraparound we will again skip the
transaction which is not intended. And, if we want to retry it after
restart of worker, how will the worker remember the previous failure?
I think this will be a useful feature but we need to consider a few more things.
[1]: /messages/by-id/CAHut+PsDysQA=JWXb6oGFr1npvqi1e7RzzXV-juCCxnbiwHvfA@mail.gmail.com
--
With Regards,
Amit Kapila.
On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Hi all,
If a logical replication worker cannot apply the change on the
subscriber for some reason (e.g., missing table or violating a
constraint, etc.), logical replication stops until the problem is
resolved. Ideally, we resolve the problem on the subscriber (e.g., by
creating the missing table or removing the conflicting data, etc.) but
occasionally a problem cannot be fixed and it may be necessary to skip
the entire transaction in question. Currently, we have two ways to
skip transactions: advancing the LSN of the replication origin on the
subscriber and advancing the LSN of the replication slot on the
publisher. But both ways might not be able to skip exactly one
transaction in question and end up skipping other transactions too.
Does it mean pg_replication_origin_advance() can't skip exactly one
txn? I'm not familiar with the function or never used it though, I was
just searching for "how to skip a single txn in postgres" and ended up
in [1]. Could you please give some more details on scenarios when we
can't skip exactly one txn? Is there any other way to advance the LSN,
something like directly updating the pg_replication_slots catalog?
[1]: https://www.postgresql.org/docs/devel/logical-replication-conflicts.html
I’d like to propose a way to skip the particular transaction on the
subscriber side. As the first step, a transaction can be specified to
be skipped by specifying remote XID on the subscriber. This feature
would need two sub-features: (1) a sub-feature for users to identify
the problem subscription and the problem transaction’s XID, and (2) a
sub-feature to skip the particular transaction to apply.
For (1), I think the simplest way would be to put the details of the
change being applied in errcontext. For example, the following
errcontext shows the remote XID as well as the action name, the
relation name, and commit timestamp:
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (c)=(1) already exists.
CONTEXT: during apply of "INSERT" for relation "public.test" in
transaction with xid 590 commit timestamp 2021-05-21
14:32:02.134273+09
The user can identify which remote XID has a problem during applying
the change (XID=590 in this case). As another idea, we can have a
statistics view for logical replication workers, showing information
of the last failure transaction.
Agree with Amit on this. At times, it is difficult to look around in
the server logs, so it will be better to have it in both places.
For (2), what I'm thinking is to add a new action to ALTER
SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
TRANSACTION 590. Also, we can have actions to reset it; ALTER
SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
XID to a new column of pg_subscription or a new catalog, having the
worker reread its subscription information. Once the worker skipped
the specified transaction, it resets the transaction to skip on the
catalog. The syntax allows users to specify one remote XID to skip. In
the future, it might be good if users can also specify multiple XIDs
(a range of XIDs or a list of XIDs, etc).
What's it like skipping a txn with txn id? Is it that the particular
txn is forced to commit or abort or just skipping some of the code in
the apply worker? IIUC, the behavior of RESET SKIP TRANSACTION is just
to forget the txn id specified in SET SKIP TRANSACTION right?
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
On Mon, May 24, 2021 at 7:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
If a logical replication worker cannot apply the change on the
subscriber for some reason (e.g., missing table or violating a
constraint, etc.), logical replication stops until the problem is
resolved. Ideally, we resolve the problem on the subscriber (e.g., by
creating the missing table or removing the conflicting data, etc.) but
occasionally a problem cannot be fixed and it may be necessary to skip
the entire transaction in question. Currently, we have two ways to
skip transactions: advancing the LSN of the replication origin on the
subscriber and advancing the LSN of the replication slot on the
publisher. But both ways might not be able to skip exactly one
transaction in question and end up skipping other transactions too.
I’d like to propose a way to skip the particular transaction on the
subscriber side. As the first step, a transaction can be specified to
be skipped by specifying remote XID on the subscriber. This feature
would need two sub-features: (1) a sub-feature for users to identify
the problem subscription and the problem transaction’s XID, and (2) a
sub-feature to skip the particular transaction to apply.
For (1), I think the simplest way would be to put the details of the
change being applied in errcontext. For example, the following
errcontext shows the remote XID as well as the action name, the
relation name, and commit timestamp:
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (c)=(1) already exists.
CONTEXT: during apply of "INSERT" for relation "public.test" in
transaction with xid 590 commit timestamp 2021-05-21
14:32:02.134273+09
In the above, the subscription name/id is not mentioned. I think you
need it for sub-feature-2.
Agreed.
The user can identify which remote XID has a problem during applying
the change (XID=590 in this case). As another idea, we can have a
statistics view for logical replication workers, showing information
of the last failure transaction.
It might be good to display at both places. Having subscriber-side
information in the view might be helpful in other ways as well like we
can use it to display the number of transactions processed by a
particular subscriber.
Yes. I think we can report that information to the stats collector. It
needs to live on even after the worker exits.
I think you need to consider a few more things here:
(a) Say the error occurs after applying some part of changes, then
just skipping the remaining part won't be sufficient, we probably need
to someway rollback the applied changes (by rolling back the
transaction or in some other way).
After more thought, it might be better that setting and resetting
the XID to skip requires disabling the subscription. This would not be
a restriction for users since logical replication is likely to already
stop (and possibly repeating restarting and stopping) due to an error.
Setting and resetting the XID modifies the system catalog so it's a
crash-safe change and survives beyond the server restarts. When a
logical replication worker starts, it checks the XID. If the worker
receives changes associated with the transaction with the specified
XID, it can ignore the entire transaction.
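So the user-visible steps under this design would be something like
the following sketch (the SET/RESET syntax is still the proposal, not
existing syntax):

```sql
-- The subscription has likely already stopped due to the apply error.
ALTER SUBSCRIPTION test_sub DISABLE;
ALTER SUBSCRIPTION test_sub SET SKIP TRANSACTION 590;  -- proposed syntax
ALTER SUBSCRIPTION test_sub ENABLE;
-- The restarted worker reads the XID from the catalog and ignores
-- all changes of the remote transaction with that XID.
```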
(b) How do you handle streamed transactions? It is possible that some
of the streams are successful and the error occurs after that, say
when writing to the stream file. Now, would you skip writing to stream
file or will you write it, and then during apply, you will skip the
entire transaction and remove the corresponding stream file.
I think streamed transactions can be handled in the same way described in (a).
(c) There is also a possibility that the error occurs while applying
the changes of some subtransaction (this is only possible for
streaming xacts), so, in such cases, do we allow users to rollback the
subtransaction or user has to rollback the entire transaction. I am
not sure but maybe for very large transactions users might just want
to rollback the subtransaction.
If the user specifies the XID of a subtransaction, it would be better
to skip only the subtransaction. If they specify the top transaction's
XID, it would be better to skip the entire transaction. What do you think?
(d) How about prepared transactions? Do we need to rollback the
prepared transaction if user decides to skip such a transaction? We
already allow prepared transactions to be streamed to plugins and the
work for subscriber-side apply is in progress [1], so I think we need
to consider this case as well.
If a transaction replicated from the publisher could be prepared on
the subscriber, it would be guaranteed to be able to be either
committed or rolled back. Given that this feature is to skip a problem
transaction, I think it should not do anything for transactions that
are already prepared on the subscriber.
(e) Do we want to provide such a feature via output plugins as well,
if not, why?
You mean to specify an XID to skip on the publisher side? Since I've
been considering this feature as a way to resume logical replication
that has hit a problem, I hadn't thought of that idea, but it could be
a good one. Do you have any use cases? If we specified the
XID on the publisher, multiple subscribers would skip that
transaction.
For (2), what I'm thinking is to add a new action to ALTER
SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
TRANSACTION 590. Also, we can have actions to reset it; ALTER
SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
XID to a new column of pg_subscription or a new catalog, having the
worker reread its subscription information. Once the worker skipped
the specified transaction, it resets the transaction to skip on the
catalog.
What if we fail while updating the reset information in the catalog?
Will it be the responsibility of the user to reset such a transaction
or we will retry it after restart of worker? Now, say, we give such a
responsibility to the user and the user forgets to reset it then there
is a possibility that after wraparound we will again skip the
transaction which is not intended. And, if we want to retry it after
restart of worker, how will the worker remember the previous failure?
As described above, setting and resetting the XID to skip is
implemented as a normal system catalog change, so it's crash-safe and
persisted. I think the worker can either remove the XID or mark it as
done once it has skipped the specified transaction so that it won't
skip the same XID again after wraparound. Also, it might be better to
reset the XID when a subscription field such as subconninfo is
changed, because that could imply the worker will connect to another
publisher having a different XID space.
We also need to handle the cases where the user specifies an old XID
or an XID whose transaction is already prepared on the subscriber. I
think the worker can reset the XID with a warning when it finds that
the XID is no longer valid or it cannot skip the specified XID. For
example, in the former case, it can do that when the first received
transaction’s XID is newer than the specified XID. In the latter case,
it can do that when it receives the commit/rollback prepared message
of the specified XID.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, May 25, 2021 at 2:49 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Hi all,
If a logical replication worker cannot apply the change on the
subscriber for some reason (e.g., missing table or violating a
constraint, etc.), logical replication stops until the problem is
resolved. Ideally, we resolve the problem on the subscriber (e.g., by
creating the missing table or removing the conflicting data, etc.) but
occasionally a problem cannot be fixed and it may be necessary to skip
the entire transaction in question. Currently, we have two ways to
skip transactions: advancing the LSN of the replication origin on the
subscriber and advancing the LSN of the replication slot on the
publisher. But both ways might not be able to skip exactly one
transaction in question and end up skipping other transactions too.
Does it mean pg_replication_origin_advance() can't skip exactly one
txn? I'm not familiar with the function or never used it though, I was
just searching for "how to skip a single txn in postgres" and ended up
in [1]. Could you please give some more details on scenarios when we
can't skip exactly one txn? Is there any other way to advance the LSN,
something like directly updating the pg_replication_slots catalog?
Sorry, it's not impossible. Although the user might mistakenly skip
more than one transaction by specifying a wrong LSN, it's always
possible to skip exactly one transaction.
[1]: https://www.postgresql.org/docs/devel/logical-replication-conflicts.html
I’d like to propose a way to skip the particular transaction on the
subscriber side. As the first step, a transaction can be specified to
be skipped by specifying remote XID on the subscriber. This feature
would need two sub-features: (1) a sub-feature for users to identify
the problem subscription and the problem transaction’s XID, and (2) a
sub-feature to skip the particular transaction to apply.
For (1), I think the simplest way would be to put the details of the
change being applied in errcontext. For example, the following
errcontext shows the remote XID as well as the action name, the
relation name, and commit timestamp:
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (c)=(1) already exists.
CONTEXT: during apply of "INSERT" for relation "public.test" in
transaction with xid 590 commit timestamp 2021-05-21
14:32:02.134273+09
The user can identify which remote XID has a problem during applying
the change (XID=590 in this case). As another idea, we can have a
statistics view for logical replication workers, showing information
of the last failure transaction.
Agree with Amit on this. At times, it is difficult to look around in
the server logs, so it will be better to have it in both places.
For (2), what I'm thinking is to add a new action to ALTER
SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
TRANSACTION 590. Also, we can have actions to reset it; ALTER
SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
XID to a new column of pg_subscription or a new catalog, having the
worker reread its subscription information. Once the worker skipped
the specified transaction, it resets the transaction to skip on the
catalog. The syntax allows users to specify one remote XID to skip. In
the future, it might be good if users can also specify multiple XIDs
(a range of XIDs or a list of XIDs, etc).
What's it like skipping a txn with txn id? Is it that the particular
txn is forced to commit or abort or just skipping some of the code in
the apply worker?
What I'm thinking is to ignore the entire transaction with the
specified XID. IOW, logical replication workers don't even start the
transaction and ignore all changes associated with the XID.
IIUC, the behavior of RESET SKIP TRANSACTION is just
to forget the txn id specified in SET SKIP TRANSACTION right?
Right. I proposed this RESET command for users to cancel the skipping behavior.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, May 25, 2021 at 1:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, May 25, 2021 at 2:49 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Hi all,
If a logical replication worker cannot apply the change on the
subscriber for some reason (e.g., missing table or violating a
constraint, etc.), logical replication stops until the problem is
resolved. Ideally, we resolve the problem on the subscriber (e.g., by
creating the missing table or removing the conflicting data, etc.) but
occasionally a problem cannot be fixed and it may be necessary to skip
the entire transaction in question. Currently, we have two ways to
skip transactions: advancing the LSN of the replication origin on the
subscriber and advancing the LSN of the replication slot on the
publisher. But both ways might not be able to skip exactly one
transaction in question and end up skipping other transactions too.
Does it mean pg_replication_origin_advance() can't skip exactly one
txn? I'm not familiar with the function or never used it though, I was
just searching for "how to skip a single txn in postgres" and ended up
in [1]. Could you please give some more details on scenarios when we
can't skip exactly one txn? Is there any other way to advance the LSN,
something like directly updating the pg_replication_slots catalog?
Sorry, it's not impossible. Although the user might mistakenly skip
more than one transaction by specifying a wrong LSN, it's always
possible to skip exactly one transaction.
IIUC, if the user specifies the "correct LSN", then it's possible to
skip exact txn for which the sync workers are unable to apply changes,
right?
How can the user get the LSN (which we call "correct LSN")? Is it from
pg_replication_slots? Or some other way?
If the user somehow can get the "correct LSN", can't the exact txn be
skipped using it with any of the existing ways, either using
pg_replication_origin_advance or any other ways?
If there's no way to get the "correct LSN", then why can't we just
print that LSN in the error context and/or in the new statistics view
for logical replication workers, so that any of the existing ways can
be used to skip exactly one txn?
IIUC, the feature proposed here guards against the users specifying
wrong LSN. If I'm right, what is the guarantee that users don't
specify the wrong txn id? Why can't we tell the users when a wrong LSN
is specified that "currently, an apply worker is failing to apply the
LSN XXXX, and you specified LSN YYYY, are you sure this is
intentional?"
Please correct me if I'm missing anything.
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
On Tue, May 25, 2021 at 7:21 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Tue, May 25, 2021 at 1:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, May 25, 2021 at 2:49 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Hi all,
If a logical replication worker cannot apply the change on the
subscriber for some reason (e.g., missing table or violating a
constraint, etc.), logical replication stops until the problem is
resolved. Ideally, we resolve the problem on the subscriber (e.g., by
creating the missing table or removing the conflicting data, etc.) but
occasionally a problem cannot be fixed and it may be necessary to skip
the entire transaction in question. Currently, we have two ways to
skip transactions: advancing the LSN of the replication origin on the
subscriber and advancing the LSN of the replication slot on the
publisher. But both ways might not be able to skip exactly one
transaction in question and end up skipping other transactions too.
Does it mean pg_replication_origin_advance() can't skip exactly one
txn? I'm not familiar with the function or never used it though, I was
just searching for "how to skip a single txn in postgres" and ended up
in [1]. Could you please give some more details on scenarios when we
can't skip exactly one txn? Is there any other way to advance the LSN,
something like directly updating the pg_replication_slots catalog?
Sorry, it's not impossible. Although the user might mistakenly skip
more than one transaction by specifying a wrong LSN, it's always
possible to skip exactly one transaction.
IIUC, if the user specifies the "correct LSN", then it's possible to
skip exact txn for which the sync workers are unable to apply changes,
right?
How can the user get the LSN (which we call "correct LSN")? Is it from
pg_replication_slots? Or some other way?
If the user somehow can get the "correct LSN", can't the exact txn be
skipped using it with any of the existing ways, either using
pg_replication_origin_advance or any other ways?
One possible way I know is to copy the logical replication slot used
by the subscriber and peek at the changes to identify the correct LSN
(maybe there is another handy way though). For example, suppose that
two transactions insert tuples as follows on the publisher:
TX-A: BEGIN;
TX-A: INSERT INTO test VALUES (1);
TX-B: BEGIN;
TX-B: INSERT INTO test VALUES (10);
TX-B: COMMIT;
TX-A: INSERT INTO test VALUES (2);
TX-A: COMMIT;
And suppose further that the insertion with value = 10 (by TX-B)
cannot be applied on the subscriber due to a unique constraint
violation. If we copy the slot by
pg_copy_logical_replication_slot('test_sub', 'copy_slot', true,
'test_decoding'), we can peek at those changes with their LSNs as follows:
=# select * from pg_logical_slot_peek_changes('copy_slot', null, null) order by lsn;
    lsn    | xid |                   data
-----------+-----+------------------------------------------
 0/1911548 | 736 | BEGIN 736
 0/1911548 | 736 | table public.test: INSERT: c[integer]:1
 0/1911588 | 737 | BEGIN 737
 0/1911588 | 737 | table public.test: INSERT: c[integer]:10
 0/19115F8 | 737 | COMMIT 737
 0/1911630 | 736 | table public.test: INSERT: c[integer]:2
 0/19116A0 | 736 | COMMIT 736
(7 rows)
In this case, '0/19115F8' is the correct LSN to specify. We can
advance the replication origin to '0/19115F8' by
pg_replication_origin_advance() so that logical replication streams
transactions committed after '0/19115F8'. After logical replication
restarts, it skips the transaction with xid = 737 but replicates the
transaction with xid = 736.
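For completeness, the advancing step could look like the sketch below.
The origin name 'pg_16390' is just an illustrative value; a
subscription's replication origin is named 'pg_' followed by the
subscription OID and can be looked up in pg_replication_origin:

```sql
-- Find the subscription's replication origin name (typically 'pg_<suboid>'):
SELECT roname FROM pg_replication_origin;
-- Stop the apply worker, advance past the problem commit, and restart:
ALTER SUBSCRIPTION test_sub DISABLE;
SELECT pg_replication_origin_advance('pg_16390', '0/19115F8');
ALTER SUBSCRIPTION test_sub ENABLE;
```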
If there's no way to get the "correct LSN", then why can't we just
print that LSN in the error context and/or in the new statistics view
for logical replication workers, so that any of the existing ways can
be used to skip exactly one txn?
I think specifying an XID on the subscription is more understandable for users.
IIUC, the feature proposed here guards against the users specifying
wrong LSN. If I'm right, what is the guarantee that users don't
specify the wrong txn id? Why can't we tell the users when a wrong LSN
is specified that "currently, an apply worker is failing to apply the
LSN XXXX, and you specified LSN YYYY, are you sure this is
intentional?"
With the initial idea, specifying the correct XID is the user's
responsibility. If they specify an old XID, the worker invalidates it
and raises a warning saying "the worker invalidated the specified XID
as it's too old". As a second idea, if we store the last failed XID
somewhere (e.g., a system catalog), the user can just specify to skip
that transaction. That is, instead of specifying the XID they could do
something like "ALTER SUBSCRIPTION test_sub RESOLVE CONFLICT BY SKIP".
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, May 25, 2021 at 12:26 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, May 24, 2021 at 7:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I think you need to consider a few more things here:
(a) Say the error occurs after applying some part of changes, then
just skipping the remaining part won't be sufficient, we probably need
to someway rollback the applied changes (by rolling back the
transaction or in some other way).
After more thought, it might be better that setting and resetting
the XID to skip requires disabling the subscription.
It might be better if it doesn't require disabling the subscription
because it would mean more steps for the user to disable/enable it. It
is not clear to me what exactly you want to gain by disabling the
subscription in this case.
This would not be
a restriction for users since logical replication is likely to already
stop (and possibly repeating restarting and stopping) due to an error.
Setting and resetting the XID modifies the system catalog so it's a
crash-safe change and survives beyond the server restarts. When a
logical replication worker starts, it checks the XID. If the worker
receives changes associated with the transaction with the specified
XID, it can ignore the entire transaction.
(b) How do you handle streamed transactions? It is possible that some
of the streams are successful and the error occurs after that, say
when writing to the stream file. Now, would you skip writing to stream
file or will you write it, and then during apply, you will skip the
entire transaction and remove the corresponding stream file.
I think streamed transactions can be handled in the same way described in (a).
(c) There is also a possibility that the error occurs while applying
the changes of some subtransaction (this is only possible for
streaming xacts), so, in such cases, do we allow users to rollback the
subtransaction or user has to rollback the entire transaction. I am
not sure but maybe for very large transactions users might just want
to rollback the subtransaction.

If the user specifies the XID of a subtransaction, it would be better to
skip only the subtransaction. If the user specifies the top transaction XID, it
would be better to skip the entire transaction. What do you think?
makes sense.
(d) How about prepared transactions? Do we need to rollback the
prepared transaction if user decides to skip such a transaction? We
already allow prepared transactions to be streamed to plugins and the
work for subscriber-side apply is in progress [1], so I think we need
to consider this case as well.

If a transaction replicated from the publisher could be prepared on
the subscriber, it is guaranteed that it can be either
committed or rolled back. Given that this feature is to skip a problem
transaction, I think it should not do anything for transactions that
are already prepared on the subscriber.
makes sense, but I think we need to reset the XID in such a case.
(e) Do we want to provide such a feature via output plugins as well,
if not, why?

You mean to specify an XID to skip on the publisher side? Since I've
been considering this feature as a way to resume logical
replication that has hit a problem, I had not thought of that idea, but it
could be a good one. Do you have any use cases?
No. Thinking about this again, I think we can leave this for now.
If we specified the
XID on the publisher, multiple subscribers would skip that
transaction.

For (2), what I'm thinking is to add a new action to ALTER
SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
TRANSACTION 590. Also, we can have actions to reset it; ALTER
SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
XID to a new column of pg_subscription or a new catalog, having the
worker reread its subscription information. Once the worker skipped
the specified transaction, it resets the transaction to skip on the
catalog.

What if we fail while updating the reset information in the catalog?
Will it be the responsibility of the user to reset such a transaction
or we will retry it after restart of worker? Now, say, we give such a
responsibility to the user and the user forgets to reset it then there
is a possibility that after wraparound we will again skip the
transaction which is not intended. And, if we want to retry it after
restart of worker, how will the worker remember the previous failure?

As described above, setting and resetting the XID to skip is implemented
as a normal system catalog change, so it's crash-safe and persisted. I
think that the worker can either remove the XID or mark it as done
once it skipped the specified transaction so that it won't skip the
same XID again after wraparound.
It all depends on when exactly you want to update the catalog
information. Say, after skipping the commit of the XID, we update the
corresponding LSN to be communicated to the publisher as already
processed, and then get an error while updating the catalog
information; next time we might not know whether to update the
catalog for the skipped XID.
Also, it might be better if we reset
the XID also when a subscription field such as subconninfo is changed
because it could imply the worker will connect to another publisher
having a different XID space.

We also need to handle the cases where the user specifies an old XID
or XID whose transaction is already prepared on the subscriber. I
think the worker can reset the XID with a warning when it finds out
that the XID seems no longer valid or it cannot skip the specified
XID. For example in the former case, it can do that when the first
received transaction’s XID is newer than the specified XID.
But how can we guarantee that an older XID can't be received later? Is
there a guarantee that we receive the transactions on the subscriber in
XID order?
--
With Regards,
Amit Kapila.
On Tue, May 25, 2021 at 6:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, May 25, 2021 at 7:21 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

If there's no way to get the "correct LSN", then why can't we just
print that LSN in the error context and/or in the new statistics view
for logical replication workers, so that any of the existing ways can
be used to skip exactly one txn?

I think specifying the XID to the subscription is more understandable for users.
I agree with you that specifying XID could be easier and
understandable for users. I was thinking and studying a bit about what
other systems do in this regard. Why don't we try to provide conflict
resolution methods for users? The idea could be that either the
conflicts can be resolved automatically or manually. In the case of
manual resolution, users can use the existing methods or the XID stuff
you are proposing here and in case of automatic resolution, the
in-built or corresponding user-defined functions will be invoked for
conflict resolution. There are more details to figure out in the
automatic resolution scheme but I see a lot of value in doing the
same.
--
With Regards,
Amit Kapila.
On Wed, May 26, 2021 at 3:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
It might be better if it doesn't require disabling the subscription
because it would be more steps for the user to disable/enable it. It
is not clear to me what exactly you want to gain by disabling the
subscription in this case.
The situation I'm considering is where the user specifies the XID while
the worker is applying the changes of the transaction with that XID.
In this case, I think we need to somehow rollback the changes applied
so far. Perhaps we can either rollback the transaction and ignore the
remaining changes or restart and ignore the entire transaction from
the beginning. Also, we need to handle the case where the user resets
the XID after the worker has skipped writing some stream files. I thought
those parts could be complicated, but they might not be, after more
thought.
(b) How do you handle streamed transactions? It is possible that some
of the streams are successful and the error occurs after that, say
when writing to the stream file. Now, would you skip writing to the stream
file or will you write it, and then during apply, you will skip the
entire transaction and remove the corresponding stream file?

I think streamed transactions can be handled in the same way described in (a).
If setting and resetting the XID can be performed during the worker
running, we would need to write stream files even if we’re receiving
changes that are associated with the specified XID. Since it could
happen that the user resets the XID after we processed some of the
streamed changes, we would need to decide whether or not to skip the
transaction when starting to apply changes.
(d) How about prepared transactions? Do we need to rollback the
prepared transaction if the user decides to skip such a transaction? We
already allow prepared transactions to be streamed to plugins and the
work for subscriber-side apply is in progress [1], so I think we need
to consider this case as well.

If a transaction replicated from the publisher could be prepared on
the subscriber, it is guaranteed that it can be either
committed or rolled back. Given that this feature is to skip a problem
transaction, I think it should not do anything for transactions that
are already prepared on the subscriber.

makes sense, but I think we need to reset the XID in such a case.
Agreed.
It all depends on when exactly you want to update the catalog
information. Say, after skipping the commit of the XID, we update the
corresponding LSN to be communicated to the publisher as already
processed, and then get an error while updating the catalog
information; next time we might not know whether to update the
catalog for the skipped XID.

But how can we guarantee that an older XID can't be received later? Is
there a guarantee that we receive the transactions on the subscriber in
XID order?
Considering the above two comments, it might be better to provide a
way to skip the transaction that is already known to be conflicted
rather than allowing users to specify an arbitrary XID.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, May 27, 2021 at 9:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
The situation I'm considering is where the user specifies the XID while
the worker is applying the changes of the transaction with that XID.
In this case, I think we need to somehow rollback the changes applied
so far. Perhaps we can either rollback the transaction and ignore the
remaining changes or restart and ignore the entire transaction from
the beginning.
If we follow your suggestion of only allowing XIDs that have been
known to have conflicts then probably we don't need to worry about
rollbacks.
Considering the above two comments, it might be better to provide a
way to skip the transaction that is already known to be conflicted
rather than allowing users to specify an arbitrary XID.
Okay, that makes sense, but I'm still not sure how you will identify
whether we need to reset the XID if doing that failed in the previous
attempt. Also, I am thinking that instead of a stats view, do we need
to consider having a system table (pg_replication_conflicts or
something like that) for this, because what if the stats information is
lost (say, either due to a crash or due to UDP packet loss)? Can we rely
on a stats view for this?
--
With Regards,
Amit Kapila.
On Wed, May 26, 2021 at 6:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, May 25, 2021 at 6:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, May 25, 2021 at 7:21 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

If there's no way to get the "correct LSN", then why can't we just
print that LSN in the error context and/or in the new statistics view
for logical replication workers, so that any of the existing ways can
be used to skip exactly one txn?

I think specifying XID to the subscription is more understandable for users.
I agree with you that specifying XID could be easier and
understandable for users. I was thinking and studying a bit about what
other systems do in this regard. Why don't we try to provide conflict
resolution methods for users? The idea could be that either the
conflicts can be resolved automatically or manually. In the case of
manual resolution, users can use the existing methods or the XID stuff
you are proposing here and in case of automatic resolution, the
in-built or corresponding user-defined functions will be invoked for
conflict resolution. There are more details to figure out in the
automatic resolution scheme but I see a lot of value in doing the
same.
Yeah, I also see a lot of value in automatic conflict resolution. But
maybe we can have both ways? For example, in cases where the user wants
to resolve conflicts in a different way, or a conflict cannot be
resolved by automatic resolution (not sure such a case exists in practice,
though), manual resolution would also have value.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, May 27, 2021 at 2:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Considering the above two comments, it might be better to provide a
way to skip the transaction that is already known to be conflicted
rather than allowing users to specify an arbitrary XID.

Okay, that makes sense, but still not sure how you will identify if we
need to reset the XID in case of failure doing that in the previous
attempt.
It's just an idea, but we can record the failed transaction's XID as
well as its commit LSN. The sequence I'm thinking of is:

1. the worker records the XID and commit LSN of the failed transaction
to a catalog.
2. the user specifies how to resolve that conflict transaction
(currently only 'skip' is supported) and writes to the catalog.
3. the worker performs the resolution method according to the catalog. If
the worker didn't start to apply those changes, it can skip the entire
transaction. If it did, it rolls back the transaction and ignores the
remaining changes.

The worker needs neither to reset the information of the last failed
transaction nor to mark the conflicted transaction as resolved. The
worker will ignore that information when checking the catalog if the
commit LSN has already been passed.
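A minimal sketch of this record-and-resolve sequence, modeling the catalog as a plain dict (all names here are illustrative, not an actual PostgreSQL catalog):

```python
# Sketch of steps 1-3 above. 'catalog' stands in for the proposed system
# catalog; 'origin_lsn' stands in for the replication origin's progress.

def record_failure(catalog, subid, xid, commit_lsn):
    # Step 1: the worker records the failed transaction.
    catalog[subid] = {"xid": xid, "commit_lsn": commit_lsn, "resolution": None}

def set_resolution(catalog, subid, method):
    # Step 2: the user specifies the resolution method (only 'skip' for now).
    catalog[subid]["resolution"] = method

def should_skip(catalog, subid, xid, origin_lsn):
    # Step 3: skip only if this is the recorded transaction, 'skip' was
    # chosen, and the conflict is not already resolved -- i.e., its commit
    # LSN has not yet been passed by the origin.
    rec = catalog.get(subid)
    return (rec is not None
            and rec["resolution"] == "skip"
            and rec["xid"] == xid
            and rec["commit_lsn"] > origin_lsn)
```

Note how no explicit reset is needed in this sketch: once the origin LSN passes the recorded commit LSN, the entry is simply ignored.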
Also, I am thinking that instead of a stats view, do we need
to consider having a system table (pg_replication_conflicts or
something like that) for this, because what if the stats information is
lost (say, either due to a crash or due to UDP packet loss)? Can we rely
on a stats view for this?
Yeah, it seems better to use a catalog.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, May 27, 2021 at 1:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, May 27, 2021 at 2:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Okay, that makes sense, but still not sure how you will identify if we
need to reset the XID in case of failure doing that in the previous
attempt.

It's just an idea, but we can record the failed transaction's XID as
well as its commit LSN. The sequence I'm thinking of is:

1. the worker records the XID and commit LSN of the failed transaction
to a catalog.
When will you record this info? I am not sure if we can try to update
this when an error has occurred. We can think of using try..catch in
apply worker and then record it in catch on error but would that be
advisable? One random thought that occurred to me is that the apply
worker could notify such information to the launcher (or maybe another
process), which will log this information.
2. the user specifies how to resolve that conflict transaction
(currently only 'skip' is supported) and writes to the catalog.
3. the worker performs the resolution method according to the catalog. If
the worker didn't start to apply those changes, it can skip the entire
transaction. If it did, it rolls back the transaction and ignores the
remaining changes.

The worker needs neither to reset the information of the last failed
transaction nor to mark the conflicted transaction as resolved. The
worker will ignore that information when checking the catalog if the
commit LSN has already been passed.
So won't this require us to check the required info in the catalog
before applying each transaction? If so, that might be an overhead; maybe
we can build some cache of the highest commit LSN that can be consulted
rather than the catalog table. I think we need to think about when to
remove rows for which a conflict has been resolved, as we can't let that
information grow infinitely.
Also, I am thinking that instead of a stats view, do we need
to consider having a system table (pg_replication_conflicts or
something like that) for this, because what if the stats information is
lost (say, either due to a crash or due to UDP packet loss)? Can we rely
on a stats view for this?

Yeah, it seems better to use a catalog.
Okay.
--
With Regards,
Amit Kapila.
On Thu, May 27, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Yeah, I also see a lot of value in automatic conflict resolution. But
maybe we can have both ways? For example, in cases where the user wants
to resolve conflicts in a different way, or a conflict cannot be
resolved by automatic resolution (not sure such a case exists in practice,
though), manual resolution would also have value.
Right, that is exactly what I was saying. So, even if both can be done
as separate patches, we should try to design the manual resolution in
a way that can be extended for an automatic resolution system. I think
we can try to have some initial idea/design/POC for an automatic
resolution as well to ensure that the manual resolution scheme can be
further extended.
--
With Regards,
Amit Kapila.
On Thu, May 27, 2021 at 7:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Yeah, I also see a lot of value in automatic conflict resolution. But
maybe we can have both ways? For example, in cases where the user wants
to resolve conflicts in a different way, or a conflict cannot be
resolved by automatic resolution (not sure such a case exists in practice,
though), manual resolution would also have value.

Right, that is exactly what I was saying. So, even if both can be done
as separate patches, we should try to design the manual resolution in
a way that can be extended for an automatic resolution system. I think
we can try to have some initial idea/design/POC for an automatic
resolution as well to ensure that the manual resolution scheme can be
further extended.
Totally agreed.
But perhaps we might want to note that the conflict resolution we're
talking about is to resolve conflicts at the row or column level. It
doesn't necessarily raise an ERROR and the granularity of resolution
is per record or column. For example, if a DELETE and an UPDATE
process the same tuple (searched by PK), the UPDATE may not find the
tuple and be ignored due to the tuple having been already deleted. In
this case, no ERROR will occur (i.e., the UPDATE will be ignored), but the
user may want to do another conflict resolution. On the other hand,
the feature proposed here assumes that an error has already occurred
and logical replication has already been stopped. And resolves it by
skipping the entire transaction.
IIUC the conflict resolution can be thought of as a combination of
types of conflicts and the resolution that can be applied to them. For
example, if there is a conflict between INSERT and INSERT and the
latter INSERT violates the unique constraint, an ERROR is raised. So
DBA can resolve it manually. But there is another way to automatically
resolve it by selecting the tuple having a newer timestamp. On the
other hand, in the DELETE and UPDATE conflict described above, it's
possible to automatically ignore the fact that the UPDATE could not update
the tuple. Or we can even generate an ERROR so that the DBA can resolve it
manually. The DBA can manually resolve the conflict in various ways:
skipping the entire transaction from the origin, choosing the tuple
having a newer/older timestamp, etc.
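The combinations described here (conflict types crossed with the resolutions that can apply to them) could be summarized in a small sketch; the conflict-type and method names are invented for illustration, not a proposed PostgreSQL API:

```python
# Illustrative map of conflict types to the resolution methods discussed
# above. All names are examples only.

RESOLUTIONS = {
    # INSERT vs INSERT (unique violation): raise an ERROR for manual
    # resolution, or automatically keep the tuple with the newer timestamp.
    "insert_insert": ["error", "keep_newer"],
    # UPDATE vs DELETE (tuple already gone): silently ignore the UPDATE,
    # or raise an ERROR so the DBA can resolve it manually.
    "update_delete": ["ignore", "error"],
}

def resolve(conflict_type, method):
    """Validate that 'method' is a supported resolution for the conflict."""
    if method not in RESOLUTIONS.get(conflict_type, []):
        raise ValueError(f"unsupported resolution {method!r} for {conflict_type}")
    return method
```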
In that sense, we can think of the feature proposed here as a feature
that provides a way to resolve the conflict that would originally
cause an ERROR by skipping the entire transaction. If we add a
solution that raises an ERROR for conflicts that don't originally
raise an ERROR (like DELETE and UPDATE conflict) in the future, we
will be able to manually skip each transaction for all types of
conflicts.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, May 27, 2021 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
When will you record this info? I am not sure if we can try to update
this when an error has occurred. We can think of using try..catch in the
apply worker and then record it in the catch on error, but would that be
advisable? One random thought that occurred to me is that the apply
worker could notify such information to the launcher (or maybe another
process), which will log this information.
Yeah, I was concerned about that too and had the same idea. The
information still could not be written if the server crashes before
the launcher writes it. But I think that's acceptable.
2. the user specifies how to resolve that conflict transaction
(currently only 'skip' is supported) and writes to the catalog.
3. the worker performs the resolution method according to the catalog.
If the worker hasn't started to apply those changes, it can skip the
entire transaction. If it has, it rolls back the transaction and
ignores the remaining changes.

The worker needs neither to reset the information of the last failed
transaction nor to mark the conflicted transaction as resolved. The
worker will ignore that information when checking the catalog if the
commit LSN has been passed.

So won't this require us to check the required info in the catalog
before applying each transaction? If so, that might be overhead; maybe
we can build some cache of the highest commit LSN that can be consulted
rather than the catalog table.
I think workers can cache that information when they start, and
invalidate and reload the cache when the catalog gets updated.
Specifying an XID to skip will update the catalog, invalidating the
cache.
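The cache-plus-invalidation idea might look roughly like this (an illustrative Python model only; the catalog is a plain dict and the version counter standing in for catalog invalidation is an assumption):

```python
# Illustrative model of a worker-side cache over a skip-XID catalog:
# the worker reloads only when the catalog's version has changed.

class SkipXidCache:
    def __init__(self, catalog):
        self.catalog = catalog      # {subid: xid_to_skip}
        self.version = -1           # last catalog version we loaded
        self.cached = {}
        self.reloads = 0

    def lookup(self, subid, catalog_version):
        # Reload only when an update has invalidated the cache.
        if catalog_version != self.version:
            self.cached = dict(self.catalog)
            self.version = catalog_version
            self.reloads += 1
        return self.cached.get(subid)

catalog = {1: 590}
cache = SkipXidCache(catalog)
first = cache.lookup(1, 0)    # loads the cache
second = cache.lookup(1, 0)   # served from cache, no reload
catalog[1] = 700              # e.g. the user specifies a new skip XID
third = cache.lookup(1, 1)    # version bump invalidates the cache
```

The design choice here is that the common path (no catalog change) never touches the catalog at all.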
I think we need to think about when to
remove rows for which conflict has been resolved as we can't let that
information grow infinitely.
I guess we can update catalog tuples in place when another conflict
happens next time. The catalog tuple should be of fixed size. An
already-resolved conflict will have a commit LSN older than its
replication origin's LSN.
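The "already resolved" test described here amounts to a simple LSN comparison (a sketch; LSNs are modeled as plain integers and the entry fields are assumptions):

```python
# A conflict entry is stale once the replication origin has advanced
# past the failed transaction's commit LSN, meaning the transaction
# was applied or skipped. LSNs are integers for illustration.

def conflict_is_resolved(conflict_commit_lsn, origin_lsn):
    return conflict_commit_lsn < origin_lsn

def live_conflicts(entries, origin_lsn):
    """Filter a subscription's conflict entries to unresolved ones."""
    return [e for e in entries
            if not conflict_is_resolved(e['commit_lsn'], origin_lsn)]

entries = [{'xid': 590, 'commit_lsn': 1000},
           {'xid': 700, 'commit_lsn': 2500}]
```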
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Sat, May 29, 2021 at 8:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, May 27, 2021 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, May 27, 2021 at 1:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
1. the worker records the XID and commit LSN of the failed transaction
to a catalog.

When will you record this info? I am not sure if we can try to update
this when an error has occurred. We can think of using try..catch in
apply worker and then record it in catch on error but would that be
advisable? One random thought that occurred to me is to that apply
worker notifies such information to the launcher (or maybe another
process) which will log this information.

Yeah, I was concerned about that too and had the same idea. The
information still could not be written if the server crashes before
the launcher writes it. But I think it's an acceptable.
True, because even if the launcher restarts, the apply worker will
error out again and resend the information. I guess we can have an
error queue where apply workers can add their information and the
launcher will then process those. If we do that, then we probably need
to define what we want to do if the queue gets full: either the apply
worker nudges the launcher and waits, or it just throws an error and
continues. If you have any better ideas to share this information then
we can consider those as well.
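The queue idea, together with the full-queue policy and the deduplication suggested later in the thread, might be modeled like this (illustrative only; the queue size and the (subscription, xid) key are assumptions):

```python
from collections import deque

# Toy model of a bounded error queue between apply workers and the
# launcher. Duplicate reports for the same (subscription, xid) pair
# are dropped so repeated failures don't flood the catalog.

class ErrorQueue:
    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.items = deque()

    def push(self, subid, xid, message):
        key = (subid, xid)
        if any((e[0], e[1]) == key for e in self.items):
            return True               # duplicate: already queued
        if len(self.items) >= self.maxsize:
            return False              # full: caller must wait or error
        self.items.append((subid, xid, message))
        return True

    def drain(self):
        """Launcher side: consume everything queued so far."""
        drained = list(self.items)
        self.items.clear()
        return drained

q = ErrorQueue(maxsize=2)
ok1 = q.push(1, 590, "duplicate key")
ok2 = q.push(1, 590, "duplicate key")   # deduplicated
ok3 = q.push(2, 42, "missing table")
ok4 = q.push(3, 77, "other error")      # queue full
```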
2. the user specifies how to resolve that conflict transaction
(currently only 'skip' is supported) and writes to the catalog.
3. the worker does the resolution method according to the catalog. If
the worker didn't start to apply those changes, it can skip the entire
transaction. If did, it rollbacks the transaction and ignores the
remaining.

The worker needs neither to reset information of the last failed
transaction nor to mark the conflicted transaction as resolved. The
worker will ignore that information when checking the catalog if the
commit LSN is passed.

So won't this require us to check the required info in the catalog
before applying each transaction? If so, that might be overhead, maybe
we can build some cache of the highest commitLSN that can be consulted
rather than the catalog table.

I think workers can cache that information when starts and invalidates
and reload the cache when the catalog gets updated. Specifying to
skip XID will update the catalog, invalidating the cache.

I think we need to think about when to
remove rows for which conflict has been resolved as we can't let that
information grow infinitely.

I guess we can update catalog tuples in place when another conflict
happens next time. The catalog tuple should be fixed size. The
already-resolved conflict will have the commit LSN older than its
replication origin's LSN.
Okay, but I have a slight concern that we will keep an XID in the
system which might no longer be valid. So we will keep this info
about subscribers around until one performs DROP SUBSCRIPTION;
hopefully, that doesn't lead to too many rows. This will be okay as
per the current design but say tomorrow we decide to parallelize the
apply for a subscription then there could be multiple errors
corresponding to a subscription and in that case, such a design might
appear quite limiting. One possibility could be that when the launcher
is periodically checking for new error messages, it can clean up the
conflicts catalog as well, or maybe autovacuum does this periodically
as it does for stats (via pgstat_vacuum_stat).
--
With Regards,
Amit Kapila.
On Sat, May 29, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, May 29, 2021 at 8:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, May 27, 2021 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, May 27, 2021 at 1:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
1. the worker records the XID and commit LSN of the failed transaction
to a catalog.

When will you record this info? I am not sure if we can try to update
this when an error has occurred. We can think of using try..catch in
apply worker and then record it in catch on error but would that be
advisable? One random thought that occurred to me is to that apply
worker notifies such information to the launcher (or maybe another
process) which will log this information.

Yeah, I was concerned about that too and had the same idea. The
information still could not be written if the server crashes before
the launcher writes it. But I think it's an acceptable.

True, because even if the launcher restarts, the apply worker will
error out again and resend the information. I guess we can have an
error queue where apply workers can add their information and the
launcher will then process those. If we do that, then we need to
probably define what we want to do if the queue gets full, either
apply worker nudge launcher and wait or it can just throw an error and
continue. If you have any better ideas to share this information then
we can consider those as well.
+1 for using an error queue. Maybe we need to avoid queuing the same
error more than once, to keep the catalog from being updated too
frequently?
2. the user specifies how to resolve that conflict transaction
(currently only 'skip' is supported) and writes to the catalog.
3. the worker does the resolution method according to the catalog. If
the worker didn't start to apply those changes, it can skip the entire
transaction. If did, it rollbacks the transaction and ignores the
remaining.

The worker needs neither to reset information of the last failed
transaction nor to mark the conflicted transaction as resolved. The
worker will ignore that information when checking the catalog if the
commit LSN is passed.

So won't this require us to check the required info in the catalog
before applying each transaction? If so, that might be overhead, maybe
we can build some cache of the highest commitLSN that can be consulted
rather than the catalog table.

I think workers can cache that information when starts and invalidates
and reload the cache when the catalog gets updated. Specifying to
skip XID will update the catalog, invalidating the cache.

I think we need to think about when to
remove rows for which conflict has been resolved as we can't let that
information grow infinitely.

I guess we can update catalog tuples in place when another conflict
happens next time. The catalog tuple should be fixed size. The
already-resolved conflict will have the commit LSN older than its
replication origin's LSN.

Okay, but I have a slight concern that we will keep xid in the system
which might have been no longer valid. So, we will keep this info
about subscribers around till one performs drop subscription,
hopefully, that doesn't lead to too many rows. This will be okay as
per the current design but say tomorrow we decide to parallelize the
apply for a subscription then there could be multiple errors
corresponding to a subscription and in that case, such a design might
appear quite limiting. One possibility could be that when the launcher
is periodically checking for new error messages, it can clean up the
conflicts catalog as well, or maybe autovacuum does this periodically
as it does for stats (via pgstat_vacuum_stat).
Yeah, it's better to have a way to clean up no-longer-valid entries in
the catalog in case the worker failed to remove them. I prefer the
former idea so far, so I'll implement it in a PoC patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, May 31, 2021 at 12:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, May 29, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
1. the worker records the XID and commit LSN of the failed transaction
to a catalog.

When will you record this info? I am not sure if we can try to update
this when an error has occurred. We can think of using try..catch in
apply worker and then record it in catch on error but would that be
advisable? One random thought that occurred to me is to that apply
worker notifies such information to the launcher (or maybe another
process) which will log this information.

Yeah, I was concerned about that too and had the same idea. The
information still could not be written if the server crashes before
the launcher writes it. But I think it's an acceptable.

True, because even if the launcher restarts, the apply worker will
error out again and resend the information. I guess we can have an
error queue where apply workers can add their information and the
launcher will then process those. If we do that, then we need to
probably define what we want to do if the queue gets full, either
apply worker nudge launcher and wait or it can just throw an error and
continue. If you have any better ideas to share this information then
we can consider those as well.

+1 for using error queue. Maybe we need to avoid queuing the same
error more than once to avoid the catalog from being updated
frequently?
Yes, I think it is important because, even after logging, the
subscription may still error again unless the user does something to
skip or resolve the conflict. I guess you need to check for the
existence of the error in the system table and/or in the queue.
I guess we can update catalog tuples in place when another conflict
happens next time. The catalog tuple should be fixed size. The
already-resolved conflict will have the commit LSN older than its
replication origin's LSN.Okay, but I have a slight concern that we will keep xid in the system
which might have been no longer valid. So, we will keep this info
about subscribers around till one performs drop subscription,
hopefully, that doesn't lead to too many rows. This will be okay as
per the current design but say tomorrow we decide to parallelize the
apply for a subscription then there could be multiple errors
corresponding to a subscription and in that case, such a design might
appear quite limiting. One possibility could be that when the launcher
is periodically checking for new error messages, it can clean up the
conflicts catalog as well, or maybe autovacuum does this periodically
as it does for stats (via pgstat_vacuum_stat).

Yeah, it's better to have a way to cleanup no longer valid entries in
the catalog in the case where the worker failed to remove it. I prefer
the former idea so far,
Which idea do you refer to here as former (cleaning up by launcher)?
so I'll implement it in a PoC patch.
Okay.
--
With Regards,
Amit Kapila.
On 27.05.21 12:04, Amit Kapila wrote:
Also, I am thinking that instead of a stat view, do we need
to consider having a system table (pg_replication_conflicts or
something like that) for this because what if stats information is
lost (say either due to crash or due to udp packet loss), can we rely
on stats view for this?

Yeah, it seems better to use a catalog.
Okay.
Could you store it in shared memory? You don't need it to be crash safe,
since the subscription will just run into the same error again after
restart. You just don't want it to be lost, like with the statistics
collector.
On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 27.05.21 12:04, Amit Kapila wrote:
Also, I am thinking that instead of a stat view, do we need
to consider having a system table (pg_replication_conflicts or
something like that) for this because what if stats information is
lost (say either due to crash or due to udp packet loss), can we rely
on stats view for this?

Yeah, it seems better to use a catalog.
Okay.
Could you store it shared memory? You don't need it to be crash safe,
since the subscription will just run into the same error again after
restart. You just don't want it to be lost, like with the statistics
collector.
But, won't that be costly in cases where we have errors in the
processing of very large transactions? Subscription has to process all
the data before it gets an error. I think we can even imagine this
feature to be extended to use commitLSN as a skip candidate in which
case we can even avoid getting the data of that transaction from the
publisher. So if this information is persistent, the user can even set
the skip identifier after the restart before the publisher can send
all the data.
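The commitLSN-as-skip-candidate idea could mean deciding at the transaction boundary, before any data is applied (a sketch only; in the logical replication protocol the BEGIN message does carry the transaction's commit LSN, but everything else here is an illustrative model):

```python
# Sketch: if the BEGIN message carries the transaction's commit LSN,
# a persistent skip setting lets the subscriber discard the whole
# transaction without applying any of its data. With publisher
# cooperation the data would not even need to be sent.

def should_skip(begin_commit_lsn, skip_commit_lsn):
    return (skip_commit_lsn is not None
            and begin_commit_lsn == skip_commit_lsn)

def apply_stream(transactions, skip_commit_lsn=None):
    """transactions: list of (commit_lsn, [changes])."""
    applied = []
    for commit_lsn, changes in transactions:
        if should_skip(commit_lsn, skip_commit_lsn):
            continue                  # entire transaction discarded
        applied.extend(changes)
    return applied

txns = [(1000, ['a']), (2000, ['b1', 'b2']), (3000, ['c'])]
```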
Also, I think we can't assume after the restart we will get the same
error because the user can perform some operations after the restart
and before we try to apply the same transaction. It might be that the
user wanted to see all the errors before the user can set the skip
identifier (and/or method).
I think the XID (or say another identifier like commitLSN) which we
want to use for skipping the transaction as specified by the user has
to be stored in the catalog because otherwise, after the restart we
won't remember it and the user won't know that he needs to set it
again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
..), isn't it better to store all conflict-related information in a
separate catalog like pg_subscription_conflict or something like that?
I think it might be also better to later extend it for auto conflict
resolution where the user can specify auto conflict resolution info
for a subscription. Is it better to store all such information in
pg_subscription or have a separate catalog? It is possible that even
if we have a separate catalog for conflict info, we might not want to
store error info there.
--
With Regards,
Amit Kapila.
On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

On 27.05.21 12:04, Amit Kapila wrote:
Also, I am thinking that instead of a stat view, do we need
to consider having a system table (pg_replication_conflicts or
something like that) for this because what if stats information is
lost (say either due to crash or due to udp packet loss), can we rely
on stats view for this?

Yeah, it seems better to use a catalog.
Okay.
Could you store it shared memory? You don't need it to be crash safe,
since the subscription will just run into the same error again after
restart. You just don't want it to be lost, like with the statistics
collector.

But, won't that be costly in cases where we have errors in the
processing of very large transactions? Subscription has to process all
the data before it gets an error.
I had the same concern. In particular, the approach we currently
discussed is to skip the transaction based on the information written
by the worker, rather than requiring the user to specify the XID.
Therefore, we will always require the worker to process the same large
transaction after the restart in order to skip it.
I think we can even imagine this
feature to be extended to use commitLSN as a skip candidate in which
case we can even avoid getting the data of that transaction from the
publisher. So if this information is persistent, the user can even set
the skip identifier after the restart before the publisher can send
all the data.
Another possible benefit of writing it to a catalog is that we can
replicate it to the physical standbys. If we have failover slots in
the future, the physical standby server also can resolve the conflict
without processing a possibly large transaction.
I think the XID (or say another identifier like commitLSN) which we
want to use for skipping the transaction as specified by the user has
to be stored in the catalog because otherwise, after the restart we
won't remember it and the user won't know that he needs to set it
again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
..), isn't it better to store all conflict-related information in a
separate catalog like pg_subscription_conflict or something like that.
I think it might be also better to later extend it for auto conflict
resolution where the user can specify auto conflict resolution info
for a subscription. Is it better to store all such information in
pg_subscription or have a separate catalog? It is possible that even
if we have a separate catalog for conflict info, we might not want to
store error info there.
Just to be clear, we need to store only the conflict-related
information that cannot be resolved without manual intervention,
right? That is, conflicts that cause an error and make the workers exit. In
general, replication conflicts include also conflicts that don’t cause
an error. I think that those conflicts don’t necessarily need to be
stored in the catalog and don’t require manual intervention.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jun 1, 2021 at 10:07 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

On 27.05.21 12:04, Amit Kapila wrote:
Also, I am thinking that instead of a stat view, do we need
to consider having a system table (pg_replication_conflicts or
something like that) for this because what if stats information is
lost (say either due to crash or due to udp packet loss), can we rely
on stats view for this?

Yeah, it seems better to use a catalog.
Okay.
Could you store it shared memory? You don't need it to be crash safe,
since the subscription will just run into the same error again after
restart. You just don't want it to be lost, like with the statistics
collector.

But, won't that be costly in cases where we have errors in the
processing of very large transactions? Subscription has to process all
the data before it gets an error.

I had the same concern. Particularly, the approach we currently
discussed is to skip the transaction based on the information written
by the worker rather than require the user to specify the XID.
Yeah, but I was imagining that the user still needs to specify
something to indicate that we need to skip it, otherwise, we might try
to skip a transaction that the user wants to resolve by itself rather
than expecting us to skip it. Another point: if we don't store this
information in a persistent way, then how will we restrict a user from
specifying some random XID that hasn't even errored after restart?
Therefore, we will always require the worker to process the same large
transaction after the restart in order to skip the transaction.

I think we can even imagine this
feature to be extended to use commitLSN as a skip candidate in which
case we can even avoid getting the data of that transaction from the
publisher. So if this information is persistent, the user can even set
the skip identifier after the restart before the publisher can send
all the data.

Another possible benefit of writing it to a catalog is that we can
replicate it to the physical standbys. If we have failover slots in
the future, the physical standby server also can resolve the conflict
without processing a possibly large transaction.
makes sense.
I think the XID (or say another identifier like commitLSN) which we
want to use for skipping the transaction as specified by the user has
to be stored in the catalog because otherwise, after the restart we
won't remember it and the user won't know that he needs to set it
again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
..), isn't it better to store all conflict-related information in a
separate catalog like pg_subscription_conflict or something like that.
I think it might be also better to later extend it for auto conflict
resolution where the user can specify auto conflict resolution info
for a subscription. Is it better to store all such information in
pg_subscription or have a separate catalog? It is possible that even
if we have a separate catalog for conflict info, we might not want to
store error info there.

Just to be clear, we need to store only the conflict-related
information that cannot be resolved without manual intervention,
right? That is, conflicts cause an error, exiting the workers. In
general, replication conflicts include also conflicts that don’t cause
an error. I think that those conflicts don’t necessarily need to be
stored in the catalog and don’t require manual intervention.
Yeah, I think we want to record the error cases, but which other
conflicts are you talking about here that don't lead to any sort of
error?
--
With Regards,
Amit Kapila.
On Tue, Jun 1, 2021 at 2:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 1, 2021 at 10:07 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

On 27.05.21 12:04, Amit Kapila wrote:
Also, I am thinking that instead of a stat view, do we need
to consider having a system table (pg_replication_conflicts or
something like that) for this because what if stats information is
lost (say either due to crash or due to udp packet loss), can we rely
on stats view for this?

Yeah, it seems better to use a catalog.
Okay.
Could you store it shared memory? You don't need it to be crash safe,
since the subscription will just run into the same error again after
restart. You just don't want it to be lost, like with the statistics
collector.

But, won't that be costly in cases where we have errors in the
processing of very large transactions? Subscription has to process all
the data before it gets an error.

I had the same concern. Particularly, the approach we currently
discussed is to skip the transaction based on the information written
by the worker rather than require the user to specify the XID.

Yeah, but I was imagining that the user still needs to specify
something to indicate that we need to skip it, otherwise, we might try
to skip a transaction that the user wants to resolve by itself rather
than expecting us to skip it.
Yeah, currently what I'm thinking is that the worker writes the
conflict that caused an error somewhere. If the user wants to resolve
it manually, they can specify the resolution method on the stopped
subscription. Until the user specifies the method and the worker
resolves it, or some fields of the subscription such as subconninfo
are updated, the conflict is not resolved and the information remains.
I think the XID (or say another identifier like commitLSN) which we
want to use for skipping the transaction as specified by the user has
to be stored in the catalog because otherwise, after the restart we
won't remember it and the user won't know that he needs to set it
again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
..), isn't it better to store all conflict-related information in a
separate catalog like pg_subscription_conflict or something like that.
I think it might be also better to later extend it for auto conflict
resolution where the user can specify auto conflict resolution info
for a subscription. Is it better to store all such information in
pg_subscription or have a separate catalog? It is possible that even
if we have a separate catalog for conflict info, we might not want to
store error info there.

Just to be clear, we need to store only the conflict-related
information that cannot be resolved without manual intervention,
right? That is, conflicts cause an error, exiting the workers. In
general, replication conflicts include also conflicts that don’t cause
an error. I think that those conflicts don’t necessarily need to be
stored in the catalog and don’t require manual intervention.

Yeah, I think we want to record the error cases but which other
conflicts you are talking about here which doesn't lead to any sort of
error?
For example, one type of replication conflict is when two updates,
arriving via logical replication or from a client, update the same
record (e.g., having the same primary key) at the same time. In that
case an error doesn't happen and we always choose the update that
arrived later. But there are other possible resolution methods, such
as choosing the one that arrived earlier, using the one having a newer
commit timestamp, using something like the priority of the node, or
even raising an error so that the user resolves it manually.
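These alternative resolution methods for an update-update conflict can be sketched as selectable strategies (illustrative Python only; the tuple field names and the priority convention are assumptions):

```python
# Illustrative update-update conflict resolvers. Each takes the local
# and remote tuple versions and returns the winner, or raises so the
# user can resolve the conflict manually.

def resolve_last_write_wins(local, remote):
    return remote if remote['commit_ts'] >= local['commit_ts'] else local

def resolve_first_write_wins(local, remote):
    return local if local['commit_ts'] <= remote['commit_ts'] else remote

def resolve_node_priority(local, remote, priority):
    # Lower number = higher priority; an arbitrary convention here.
    return local if priority[local['node']] <= priority[remote['node']] else remote

def resolve_error(local, remote):
    raise ValueError("update_conflict: manual resolution required")

local  = {'node': 'sub', 'commit_ts': 100, 'val': 'L'}
remote = {'node': 'pub', 'commit_ts': 200, 'val': 'R'}
```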
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jun 1, 2021 at 1:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jun 1, 2021 at 2:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 1, 2021 at 10:07 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

On 27.05.21 12:04, Amit Kapila wrote:
Also, I am thinking that instead of a stat view, do we need
to consider having a system table (pg_replication_conflicts or
something like that) for this because what if stats information is
lost (say either due to crash or due to udp packet loss), can we rely
on stats view for this?

Yeah, it seems better to use a catalog.
Okay.
Could you store it shared memory? You don't need it to be crash safe,
since the subscription will just run into the same error again after
restart. You just don't want it to be lost, like with the statistics
collector.

But, won't that be costly in cases where we have errors in the
processing of very large transactions? Subscription has to process all
the data before it gets an error.

I had the same concern. Particularly, the approach we currently
discussed is to skip the transaction based on the information written
by the worker rather than require the user to specify the XID.

Yeah, but I was imagining that the user still needs to specify
something to indicate that we need to skip it, otherwise, we might try
to skip a transaction that the user wants to resolve by itself rather
than expecting us to skip it.

Yeah, currently what I'm thinking is that the worker writes the
conflict that caused an error somewhere. If the user wants to resolve
it manually they can specify the resolution method to the stopped
subscription. Until the user specifies the method and the worker
resolves it or some fields of the subscription such as subconninfo are
updated, the conflict is not resolved and the information lasts.
I think we can work out such details, but I'm not sure; tinkering
with subconninfo was not in my mind.
I think the XID (or say another identifier like commitLSN) which we
want to use for skipping the transaction as specified by the user has
to be stored in the catalog because otherwise, after the restart we
won't remember it and the user won't know that he needs to set it
again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
..), isn't it better to store all conflict-related information in a
separate catalog like pg_subscription_conflict or something like that.
I think it might be also better to later extend it for auto conflict
resolution where the user can specify auto conflict resolution info
for a subscription. Is it better to store all such information in
pg_subscription or have a separate catalog? It is possible that even
if we have a separate catalog for conflict info, we might not want to
store error info there.

Just to be clear, we need to store only the conflict-related
information that cannot be resolved without manual intervention,
right? That is, conflicts cause an error, exiting the workers. In
general, replication conflicts include also conflicts that don’t cause
an error. I think that those conflicts don’t necessarily need to be
stored in the catalog and don’t require manual intervention.Yeah, I think we want to record the error cases but which other
conflicts you are talking about here which doesn't lead to any sort of
error?

For example, I think it's one type of replication conflict that two
updates that arrived via logical replication or from the client update
the same record (e.g., having the same primary key) at the same time.
In that case an error doesn't happen and we always choose the update
that arrived later.
I think we choose whichever is earlier, as we first try to find the
tuple in the local rel and, if it is not found, we silently ignore the
update/delete operation.
But there are other possible resolution methods
such as choosing the one that arrived former, using the one having a
newer commit timestamp, using something like priority of the node, and
even raising an error so that the user manually resolves it.
Agreed. I think we need to log only the ones which lead to error.
--
With Regards,
Amit Kapila.
On 01.06.21 06:01, Amit Kapila wrote:
But, won't that be costly in cases where we have errors in the
processing of very large transactions? Subscription has to process all
the data before it gets an error. I think we can even imagine this
feature to be extended to use commitLSN as a skip candidate in which
case we can even avoid getting the data of that transaction from the
publisher. So if this information is persistent, the user can even set
the skip identifier after the restart before the publisher can send
all the data.
At least in current practice, skipping parts of the logical replication
stream on the subscriber is a rare, emergency-level operation when
something that shouldn't have happened happened. So it doesn't really
matter how costly it is. It's not going to be more costly than the
error happening in the first place. All you'd need is one shared
memory slot per subscription to store an XID to skip.
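The one-slot-per-subscription idea might be modeled like this (toy Python standing in for a fixed shared-memory array; the consume-and-clear behavior is an assumption of this sketch, and the structure is deliberately not crash safe):

```python
# Toy stand-in for a fixed shared-memory array with one skip-XID slot
# per subscription. Contents are lost on restart, mirroring the
# "not crash safe, but not lost like the stats collector" property.

EMPTY = 0  # InvalidTransactionId-style sentinel

class SkipXidSlots:
    def __init__(self, max_subscriptions):
        self.slots = [EMPTY] * max_subscriptions   # fixed at startup

    def set_skip_xid(self, subidx, xid):
        self.slots[subidx] = xid

    def consume(self, subidx, xid):
        """Worker side: skip iff the slot matches, clearing it after use."""
        if self.slots[subidx] == xid:
            self.slots[subidx] = EMPTY
            return True
        return False

slots = SkipXidSlots(max_subscriptions=4)
slots.set_skip_xid(0, 590)
```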
We will also want some proper conflict handling at some point. But I
think what is being discussed here is meant to be a repair tool, not a
policy tool, and I'm afraid it might get over-engineered.
On Tue, Jun 1, 2021 at 9:05 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 01.06.21 06:01, Amit Kapila wrote:
But, won't that be costly in cases where we have errors in the
processing of very large transactions? Subscription has to process all
the data before it gets an error. I think we can even imagine this
feature to be extended to use commitLSN as a skip candidate in which
case we can even avoid getting the data of that transaction from the
publisher. So if this information is persistent, the user can even set
the skip identifier after the restart before the publisher can send
all the data.

At least in current practice, skipping parts of the logical replication
stream on the subscriber is a rare, emergency-level operation when
something that shouldn't have happened happened. So it doesn't really
matter how costly it is. It's not going to be more costly than the
error happening in the first place. All you'd need is one shared memory
slot per subscription to store an XID to skip.
Leaving aside the performance point, how can we manage by just storing
a skip identifier (XID/commitLSN) in shared memory? How will the apply
worker know that information after a restart? Do you expect the user
to set it again? If so, I think users might not like that. Also, how
will we prevent users from giving an identifier other than that of a
failed transaction, and if they do, what should our action be? Without
that, if a user provides the XID of some in-progress transaction, we
might need to do more work (a rollback) rather than just skipping it.
We will also want some proper conflict handling at some point. But I
think what is being discussed here is meant to be a repair tool, not a
policy tool, and I'm afraid it might get over-engineered.
I got your point, but I am also a bit skeptical: handling all the
boundary cases might become tricky if we go with a simple shared
memory technique. OTOH, if we can handle all such cases, then it is
fine.
--
With Regards,
Amit Kapila.
On Wed, Jun 2, 2021 at 3:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 1, 2021 at 9:05 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
[...]
All you'd need is one shared memory slot per subscription to store an
XID to skip.

Leaving aside the performance point, how can we manage by just storing
a skip identifier (XID/commitLSN) in shared memory? How will the apply
worker know that information after a restart? Do you expect the user
to set it again? If so, I think users might not like that. Also, how
will we prevent users from giving an identifier other than that of a
failed transaction, and if they do, what should our action be? Without
that, if a user provides the XID of some in-progress transaction, we
might need to do more work (a rollback) rather than just skipping it.
I think the simplest solution would be to have a fixed-size array in
shared memory to store information about transactions to skip on a
particular subscription. Given that this feature is meant to be a
repair tool for emergency cases, 32 or 64 entries seem enough. That
information should be visible to users via a system view, and each
entry is cleared once the worker has skipped the transaction. We would
also need to clear the entry if the meta information of the
subscription, such as conninfo and slot name, has been changed. The
worker reads that information at least when starting logical
replication. The worker receives changes from the publisher and checks
whether the transaction should be skipped when starting to apply those
changes. If so, the worker skips applying all changes of the
transaction and removes stream files if they exist.
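As a rough sketch only (this is not code from any posted patch; the names SkipXactEntry, skip_xact_set, and skip_xact_check_and_clear are invented for illustration), the fixed-size array described above could look like this, with an entry cleared as soon as the worker decides to skip the transaction:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t Oid;
typedef uint32_t TransactionId;

#define SKIP_XACT_MAX 64		/* "32 or 64 entries seem enough" */

/* One entry per subscription with a pending skip request. */
typedef struct SkipXactEntry
{
	Oid			subid;			/* subscription OID; 0 means the slot is free */
	TransactionId xid;			/* remote xid the apply worker should skip */
} SkipXactEntry;

static SkipXactEntry skip_xact_array[SKIP_XACT_MAX];

/* Register a transaction to skip; returns false if the array is full. */
static bool
skip_xact_set(Oid subid, TransactionId xid)
{
	int			free_idx = -1;

	for (int i = 0; i < SKIP_XACT_MAX; i++)
	{
		if (skip_xact_array[i].subid == subid)
		{
			skip_xact_array[i].xid = xid;	/* overwrite an existing request */
			return true;
		}
		if (skip_xact_array[i].subid == 0 && free_idx < 0)
			free_idx = i;
	}
	if (free_idx < 0)
		return false;
	skip_xact_array[free_idx].subid = subid;
	skip_xact_array[free_idx].xid = xid;
	return true;
}

/*
 * Called when a remote transaction starts to be applied: returns true if
 * it should be skipped, clearing the entry so the request is one-shot.
 */
static bool
skip_xact_check_and_clear(Oid subid, TransactionId xid)
{
	for (int i = 0; i < SKIP_XACT_MAX; i++)
	{
		if (skip_xact_array[i].subid == subid &&
			skip_xact_array[i].xid == xid)
		{
			memset(&skip_xact_array[i], 0, sizeof(SkipXactEntry));
			return true;
		}
	}
	return false;
}
```

In the real design this array would live in shared memory and the lookup would happen in the apply worker; the sketch only models the set/check/clear lifecycle discussed in the thread.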
Regarding the point of how to check whether the XID specified by the
user is valid, I guess it’s not easy to do since XIDs sent from the
publisher arrive in random order. Considering the use case of this
tool, the typical situation is that logical replication gets stuck due
to a problem transaction and the worker repeatedly restarts and raises
an error. So I guess it would also be a good idea to let the user
specify skipping the first transaction (or the first N transactions)
after the subscription starts logical replication. It’s less flexible
but seems enough to resolve such a situation, and it doesn’t have the
problem of validating the XID. If functionality to let the subscriber
know the oldest XID that can possibly be sent were useful for other
purposes too, it would also be a good idea to implement it, but I'm
not sure about other use cases.
Anyway, it seems to me that we need to consider the user interface
first, especially how the user specifies the transaction to skip. My
current feeling is that specifying an XID is intuitive and flexible,
but the user needs two steps: check the XID and then specify it, and
there is a risk that the user mistakenly specifies a wrong XID. On the
other hand, the idea of specifying to skip the first transaction
doesn’t require the user to check and specify an XID but is less
flexible, and “the first” transaction might be ambiguous for the user.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jun 15, 2021 at 6:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
[...]

I think the simplest solution would be to have a fixed-size array in
shared memory to store information about transactions to skip on a
particular subscription. Given that this feature is meant to be a
repair tool for emergency cases, 32 or 64 entries seem enough.

IIUC, here you are talking about the XIDs specified by the user to
skip? If so, how will you get that information after a restart, and
why do you need 32 or 64 entries for it?
Anyway, it seems to me that we need to consider the user interface
first, especially how the user specifies the transaction to skip. My
current feeling is that specifying an XID is intuitive and flexible,
but the user needs two steps: check the XID and then specify it, and
there is a risk that the user mistakenly specifies a wrong XID. On the
other hand, the idea of specifying to skip the first transaction
doesn’t require the user to check and specify an XID but is less
flexible, and “the first” transaction might be ambiguous for the user.
I see your point in allowing the user to specify the first N
transactions, but OTOH, I am slightly afraid that it might lead to
skipping some useful transactions, which will make the replica
out-of-sync. BTW, is there any data point for the user to check how
many transactions can be skipped? Normally, we won't be able to
proceed till we resolve/skip the transaction that is generating an
error. One possibility could be that we provide some *superuser*
functions like
pg_logical_replication_skip_xact()/pg_logical_replication_reset_skip_xact()
which take a subscription name/id and an xid as input parameters.
Then, I think we can store this information in ReplicationState and
probably try to map the subscription name/id to an origin id to
retrieve that info. We can probably document that the effects of these
functions won't last after a restart. Now, if these functions are used
by superusers, then we can probably trust the XIDs they provide to be
skipped, but OTOH, restricting these functions to superusers might
limit the usage of this repair tool.
--
With Regards,
Amit Kapila.
On Wed, Jun 16, 2021 at 6:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 15, 2021 at 6:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
[...]
I think the simplest solution would be to have a fixed-size array in
shared memory to store information about transactions to skip on a
particular subscription. Given that this feature is meant to be a
repair tool for emergency cases, 32 or 64 entries seem enough.

IIUC, here you are talking about the XIDs specified by the user to
skip?
Yes. I think we need to store pairs of subid and xid.
If so, how will you get that information after a restart, and why do
you need 32 or 64 entries for it?
That information doesn't last after a restart. I think the situation
in which a DBA uses this tool is that they fix the subscription on the
spot. Once the subscription has skipped the transaction, the entry
holding that information is cleared. So I’m thinking that we don’t
need to hold many entries, and the information does not necessarily
need to be durable. I think your idea below of storing that
information in ReplicationState seems better to me.
[...]

I see your point in allowing the user to specify the first N
transactions, but OTOH, I am slightly afraid that it might lead to
skipping some useful transactions, which will make the replica
out-of-sync.
Agreed.
It might be better to skip only the first transaction.
BTW, is there any data point for the user to check how many
transactions can be skipped? Normally, we won't be able to proceed
till we resolve/skip the transaction that is generating an error. One
possibility could be that we provide some *superuser* functions like
pg_logical_replication_skip_xact()/pg_logical_replication_reset_skip_xact()
which take a subscription name/id and an xid as input parameters.
Then, I think we can store this information in ReplicationState and
probably try to map the subscription name/id to an origin id to
retrieve that info. We can probably document that the effects of these
functions won't last after a restart.
ReplicationState seems a reasonable place to store that information.
Now, if these functions are used by superusers, then we can probably
trust the XIDs they provide to be skipped, but OTOH, restricting these
functions to superusers might limit the usage of this repair tool.
If we specify the subscription id or name, maybe we can also allow
the owner of the subscription to do that operation?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jun 17, 2021 at 3:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
[...]
If we specify the subscription id or name, maybe we can also allow
the owner of the subscription to do that operation?
Ah, the owner of the subscription must be a superuser.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jun 17, 2021 at 6:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
[...]

Ah, the owner of the subscription must be a superuser.
I've attached PoC patches.

The 0001 patch introduces the ability to skip transactions on the
subscriber side. We can specify the XID for a subscription with a
command like ALTER SUBSCRIPTION test_sub SET SKIP TRANSACTION 100. The
implementation seems straightforward except for setting the origin
state. After skipping the transaction, we have to update the session
origin state so that we can start streaming from the transaction next
to the one that we just skipped in case the server crashes or the
apply worker restarts. Normally, we set the origin state on the commit
WAL record. However, since we skip all changes, we don’t write any WAL
even if we call CommitTransaction() at the end of the skipped
transaction. So the patch sets the origin state on the transaction
that updates the pg_subscription system catalog to reset the skip XID.
I think we need a discussion of this part.
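To illustrate the origin-state problem described above with a deliberately simplified model (this is not the patch's code; OriginState, apply_or_skip, and the two LSN fields are invented for the sketch): the in-memory origin position can always be advanced, but the position only becomes durable when it rides on some WAL record, which a fully skipped transaction never writes:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;	/* stand-in for PostgreSQL's WAL position type */

/* Simplified session origin state for one subscription. */
typedef struct OriginState
{
	XLogRecPtr	remote_lsn_mem;		/* in-memory replay progress */
	XLogRecPtr	remote_lsn_durable;	/* what would survive a crash */
} OriginState;

/*
 * Apply (or skip) one remote transaction ending at commit_lsn.  When the
 * transaction is applied normally, the origin position is attached to the
 * commit WAL record and becomes durable.  When it is skipped, no WAL is
 * written, so only the in-memory position advances; after a crash the
 * worker would resume from remote_lsn_durable and receive the problem
 * transaction again, unless the skip is made durable some other way
 * (the patch attaches the origin state to the catalog update that resets
 * the skip XID).
 */
static void
apply_or_skip(OriginState *st, XLogRecPtr commit_lsn, bool skip)
{
	st->remote_lsn_mem = commit_lsn;
	if (!skip)
		st->remote_lsn_durable = commit_lsn;	/* rides on the commit record */
}
```

Running the model shows the gap: after skipping a transaction the durable position lags the in-memory one, which is exactly why the patch needs some WAL write to carry the origin state forward.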
With the 0002 and 0003 patches, we report the error information in the
server logs and the stats view, respectively. The 0002 patch adds an
errcontext for messages raised while applying changes:

ERROR: duplicate key value violates unique constraint "hoge_pkey"
DETAIL: Key (c)=(1) already exists.
CONTEXT: during apply of "INSERT" for relation "public.hoge" in
transaction with xid 736 committs 2021-06-27 12:12:30.053887+09

The 0003 patch adds the pg_stat_logical_replication_error statistics
view discussed on another thread[1]. The apply worker sends the error
information to the stats collector if an error happens while applying
changes. We can check those errors as follows:
postgres(1:25250)=# select * from pg_stat_logical_replication_error;
subname | relid | action | xid | last_failure
----------+-------+--------+-----+-------------------------------
test_sub | 16384 | INSERT | 736 | 2021-06-27 12:12:45.142675+09
(1 row)
For now, I added to the view only the columns required for the
transaction skipping feature.

Please note that these patches are meant to evaluate the concept we've
discussed so far; they don't include doc updates yet.
Regards,
[1]: /messages/by-id/DB35438F-9356-4841-89A0-412709EBD3AB@enterprisedb.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v1-0003-Add-pg_stat_logical_replication_error-statistics-.patch
From 354b0b6a0fa00c3cb487103084ce5618d6c6a38f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:22:13 +0900
Subject: [PATCH v1 3/3] Add pg_stat_logical_replication_error statistics view.
---
src/backend/catalog/system_views.sql | 10 ++
src/backend/postmaster/pgstat.c | 207 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 28 ++-
src/backend/utils/adt/pgstatfuncs.c | 57 +++++++
src/include/catalog/pg_proc.dat | 8 +
src/include/pgstat.h | 31 ++++
src/test/regress/expected/rules.out | 7 +
7 files changed, 347 insertions(+), 1 deletion(-)
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 999d984068..e00d6e4fc0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,13 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_logical_replication_error AS
+ SELECT
+ s.subname,
+ e.relid,
+ e.action,
+ e.xid,
+ e.last_failure
+ FROM pg_subscription as s,
+ LATERAL pg_stat_get_logical_replication_error(oid) as e;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b0d07c0e0b..51bb73196e 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/partition.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +107,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_LOGICALREP_ERR_HASH_SIZE 32
/* ----------
@@ -279,6 +281,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *logicalRepErrHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -320,6 +323,8 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_LogicalRepErrEntry * pgstat_get_logicalrep_error_entry(Oid subid, bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -358,6 +363,7 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_logicalrep_error(PgStat_MsgLogicalRepErr *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1134,6 +1140,25 @@ pgstat_vacuum_stat(void)
}
}
+ if (logicalRepErrHash)
+ {
+ PgStat_LogicalRepErrEntry *errentry;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ hash_seq_init(&hstat, logicalRepErrHash);
+ while ((errentry = (PgStat_LogicalRepErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &errentry->subid, HASH_FIND, NULL) == NULL)
+ pgstat_report_logicalrep_error_clear(errentry->subid);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1863,6 +1888,46 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_logicalrep_error() -
+ *
+ * Tell the collector about an error in a logical replication transaction.
+ * ----------
+ */
+void
+pgstat_report_logicalrep_error(Oid subid, LogicalRepMsgType action,
+ TransactionId xid, Oid relid)
+{
+ PgStat_MsgLogicalRepErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_LOGICALREPERROR);
+ msg.m_subid = subid;
+ msg.m_clear = false;
+ msg.m_action = action;
+ msg.m_xid = xid;
+ msg.m_relid = relid;
+ msg.m_last_failure = GetCurrentTimestamp();
+ pgstat_send(&msg, sizeof(PgStat_MsgLogicalRepErr));
+}
+
+/* ----------
+ * pgstat_report_logicalrep_error_clear() -
+ *
+ * Tell the collector about dropping the subscription, clearing
+ * the corresponding logical replication error information.
+ * ----------
+ */
+void
+pgstat_report_logicalrep_error_clear(Oid subid)
+{
+ PgStat_MsgLogicalRepErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_LOGICALREPERROR);
+ msg.m_subid = subid;
+ msg.m_clear = true;
+ pgstat_send(&msg, sizeof(PgStat_MsgLogicalRepErr));
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2895,6 +2960,23 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_logicalrep_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the logical replication error struct.
+ * ---------
+ */
+PgStat_LogicalRepErrEntry *
+pgstat_fetch_logicalrep_error(Oid subid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_logicalrep_error_entry(subid, false);
+}
+
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3424,6 +3506,10 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_LOGICALREPERROR:
+ pgstat_recv_logicalrep_error(&msg.msg_logicalreperr, len);
+ break;
+
default:
break;
}
@@ -3725,6 +3811,22 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write logical replication transaction error struct.
+ */
+ if (logicalRepErrHash)
+ {
+ PgStat_LogicalRepErrEntry *errent;
+
+ hash_seq_init(&hstat, logicalRepErrHash);
+ while ((errent = (PgStat_LogicalRepErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ fputc('L', fpout);
+ rc = fwrite(errent, sizeof(PgStat_LogicalRepErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4184,6 +4286,46 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ case 'L':
+ {
+ PgStat_LogicalRepErrEntry errbuf;
+ PgStat_LogicalRepErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_LogicalRepErrEntry), fpin)
+ != sizeof(PgStat_LogicalRepErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (logicalRepErrHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_LogicalRepErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ logicalRepErrHash = hash_create("Logical replication transaction error hash",
+ PGSTAT_LOGICALREP_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ errent =
+ (PgStat_LogicalRepErrEntry *) hash_search(logicalRepErrHash,
+ (void *) &errbuf.subid,
+ HASH_ENTER, NULL);
+ errent->subid = errbuf.subid;
+ errent->relid = errbuf.relid;
+ errent->action = errbuf.action;
+ errent->xid = errbuf.xid;
+ errent->last_failure = errbuf.last_failure;
+ break;
+ }
+
case 'E':
goto done;
@@ -4396,6 +4538,7 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
PgStat_WalStats myWalStats;
PgStat_SLRUStats mySLRUStats[SLRU_NUM_ELEMENTS];
PgStat_StatReplSlotEntry myReplSlotStats;
+ PgStat_LogicalRepErrEntry myLogicalRepErrs;
FILE *fpin;
int32 format_id;
const char *statfile = permanent ? PGSTAT_STAT_PERMANENT_FILENAME : pgstat_stat_filename;
@@ -4526,6 +4669,18 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ case 'L':
+ if (fread(&myLogicalRepErrs, 1, sizeof(PgStat_LogicalRepErrEntry), fpin)
+ != sizeof(PgStat_LogicalRepErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+ break;
+
case 'E':
goto done;
@@ -4716,6 +4871,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ logicalRepErrHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5650,6 +5806,33 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_logicalrep_error() -
+ *
+ * Process a LOGICALREPERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_logicalrep_error(PgStat_MsgLogicalRepErr *msg, int len)
+{
+ PgStat_LogicalRepErrEntry *errent;
+
+ if (msg->m_clear)
+ {
+ if (logicalRepErrHash != NULL)
+ hash_search(logicalRepErrHash, (void *) &msg->m_subid,
+ HASH_REMOVE, NULL);
+ return;
+ }
+
+ errent = pgstat_get_logicalrep_error_entry(msg->m_subid, true);
+
+ errent->relid = msg->m_relid;
+ errent->action = msg->m_action;
+ errent->xid = msg->m_xid;
+ errent->last_failure = msg->m_last_failure;
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5747,6 +5930,30 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+static PgStat_LogicalRepErrEntry *
+pgstat_get_logicalrep_error_entry(Oid subid, bool create)
+{
+ PgStat_LogicalRepErrEntry *errent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ if (logicalRepErrHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_LogicalRepErrEntry);
+ logicalRepErrHash = hash_create("Logical replication transaction error hash",
+ PGSTAT_LOGICALREP_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ errent = (PgStat_LogicalRepErrEntry *) hash_search(logicalRepErrHash,
+ (void *) &subid,
+ action, NULL);
+ return errent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index b65f72c9a4..3a09c27b16 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3349,7 +3349,30 @@ ApplyWorkerMain(Datum main_arg)
walrcv_startstreaming(LogRepWorkerWalRcvConn, &options);
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ if (apply_error_callback_arg.action != -1)
+ {
+ Oid relid;
+
+ if (apply_error_callback_arg.rel)
+ relid = apply_error_callback_arg.rel->localreloid;
+ else
+ relid = InvalidOid;
+
+ pgstat_report_logicalrep_error(MySubscription->oid,
+ apply_error_callback_arg.action,
+ apply_error_callback_arg.remote_xid,
+ relid);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
@@ -3459,6 +3482,9 @@ stop_skipping_changes(bool reset_xid, LogicalRepCommitData *commit_data)
CommitTransactionCommand();
+ /* Clear the error information in the stats collector too */
+ pgstat_report_logicalrep_error_clear(MySubscription->oid);
+
return true;
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 14056f5347..6dd4f3887f 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2380,3 +2381,59 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the logical replication error for the given subscription.
+ */
+Datum
+pg_stat_get_logical_replication_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_LOGICAL_REPLICATION_ERROR_COLS 5
+ Oid subid = PG_GETARG_OID(0);
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_LOGICAL_REPLICATION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_LOGICAL_REPLICATION_ERROR_COLS];
+ PgStat_LogicalRepErrEntry *errent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_LOGICAL_REPLICATION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "action",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ errent = pgstat_fetch_logicalrep_error(subid);
+
+ values[0] = ObjectIdGetDatum(subid);
+
+ if (!errent)
+ {
+ MemSet(nulls, true, sizeof(nulls));
+ nulls[0] = false;
+ }
+ else
+ {
+ if (OidIsValid(errent->relid))
+ values[1] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[1] = true;
+
+ values[2] = CStringGetTextDatum(logicalrep_action(errent->action));
+ values[3] = TransactionIdGetDatum(errent->xid);
+ values[4] = TimestampTzGetDatum(errent->last_failure);
+ }
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fde251fa4f..1c5283a4e3 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5317,6 +5317,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about logical replication error',
+ proname => 'pg_stat_get_logical_replication_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid',
+ proallargtypes => '{oid,oid,oid,text,xid,timestamptz}',
+ proargmodes => '{i,o,o,o,o,o}',
+ proargnames => '{subid,subid,relid,action,xid,last_failure}',
+ prosrc => 'pg_stat_get_logical_replication_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9612c0a6c2..46f6d29a21 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,7 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_LOGICALREPERROR,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -539,6 +541,21 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgLogicalRepErr Sent by an apply worker to update the error
+ * of a logical replication transaction.
+ * ----------
+ */
+typedef struct PgStat_MsgLogicalRepErr
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ bool m_clear;
+ Oid m_relid;
+ LogicalRepMsgType m_action;
+ TransactionId m_xid;
+ TimestampTz m_last_failure;
+} PgStat_MsgLogicalRepErr;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -710,6 +727,7 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgLogicalRepErr msg_logicalreperr;
} PgStat_Msg;
@@ -908,6 +926,15 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+typedef struct PgStat_LogicalRepErrEntry
+{
+ Oid subid;
+ Oid relid;
+ LogicalRepMsgType action;
+ TransactionId xid;
+ TimestampTz last_failure;
+} PgStat_LogicalRepErrEntry;
+
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1011,6 +1038,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_logicalrep_error(Oid subid, LogicalRepMsgType action,
+ TransactionId xid, Oid relid);
+extern void pgstat_report_logicalrep_error_clear(Oid subid);
extern void pgstat_initialize(void);
@@ -1106,6 +1136,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_LogicalRepErrEntry *pgstat_fetch_logicalrep_error(Oid subid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..506801e19b 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1879,6 +1879,13 @@ pg_stat_gssapi| SELECT s.pid,
s.gss_enc AS encrypted
FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
WHERE (s.client_port IS NOT NULL);
+pg_stat_logical_replication_error| SELECT s.subname,
+ e.relid,
+ e.action,
+ e.xid,
+ e.last_failure
+ FROM pg_subscription s,
+ LATERAL pg_stat_get_logical_replication_error(s.oid) e(subid, relid, action, xid, last_failure);
pg_stat_progress_analyze| SELECT s.pid,
s.datid,
d.datname,
--
2.24.3 (Apple Git-128)
Attachment: v1-0002-Add-errcontext-to-errors-of-the-applying-logical-.patch
From 45f456701eb015d9e34ab28d580124981f90e420 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v1 2/3] Add errcontext to errors of the applying logical
replication changes.
---
src/backend/replication/logical/proto.c | 41 ++++++++
src/backend/replication/logical/worker.c | 119 +++++++++++++++++++----
src/include/replication/logicalproto.h | 1 +
3 files changed, 143 insertions(+), 18 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 1cf59e0fb0..08e63c3a89 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -898,3 +898,44 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Get the string representation of a LogicalRepMsgType.
+ */
+const char *
+logicalrep_action(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ default:
+ return "UNKNOWN";
+ }
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index b90a8df166..b65f72c9a4 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -149,6 +149,21 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType action; /* 0 if invalid */
+ LogicalRepRelMapEntry *rel;
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .action = 0,
+ .rel = NULL,
+ .remote_xid = InvalidTransactionId,
+ .committs = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -276,6 +291,8 @@ static bool start_skipping_changes(TransactionId xid);
static bool stop_skipping_changes(bool reset_xid,
LogicalRepCommitData *commit_data);
+static void apply_error_callback(void *arg);
+static void reset_apply_error_context_info(void);
/*
* Should this worker apply changes for given relation.
@@ -788,6 +805,8 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.committime;
remote_final_lsn = begin_data.final_lsn;
@@ -828,6 +847,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -876,6 +896,7 @@ apply_handle_stream_start(StringInfo s)
* streaming data and subxact info.
*/
begin_replication_step();
+ apply_error_callback_arg.remote_xid = stream_xid;
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
@@ -941,6 +962,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -964,7 +986,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ apply_error_callback_arg.remote_xid = xid;
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -989,6 +1014,7 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ apply_error_callback_arg.remote_xid = subxid;
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1013,6 +1039,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1047,6 +1074,8 @@ apply_handle_stream_abort(StringInfo s)
/* Stop the skipping transaction if enabled */
stop_skipping_changes(true, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1064,6 +1093,8 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.committs = commit_data.committime;
remote_final_lsn = commit_data.commit_lsn;
@@ -1083,6 +1114,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1330,6 +1363,8 @@ apply_handle_insert(StringInfo s)
return;
}
+ apply_error_callback_arg.rel = rel;
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1451,6 +1486,8 @@ apply_handle_update(StringInfo s)
return;
}
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1607,6 +1644,8 @@ apply_handle_delete(StringInfo s)
return;
}
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -2066,6 +2105,7 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
/*
* Skip all data-modification changes if we're skipping changes of this
@@ -2078,43 +2118,49 @@ apply_dispatch(StringInfo s)
action == LOGICAL_REP_MSG_TRUNCATE))
return;
+ /* Push apply error context callback */
+ apply_error_callback_arg.action = action;
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
+
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2123,29 +2169,32 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_END:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
}
/*
@@ -3412,3 +3461,37 @@ stop_skipping_changes(bool reset_xid, LogicalRepCommitData *commit_data)
return true;
}
+
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("during apply of \"%s\""),
+ logicalrep_action(apply_error_callback_arg.action));
+
+ if (apply_error_callback_arg.rel)
+ appendStringInfo(&buf, _(" for relation \"%s.%s\""),
+ apply_error_callback_arg.rel->remoterel.nspname,
+ apply_error_callback_arg.rel->remoterel.relname);
+
+ if (TransactionIdIsNormal(apply_error_callback_arg.remote_xid))
+ appendStringInfo(&buf, _(" in transaction with xid %u committs %s"),
+ apply_error_callback_arg.remote_xid,
+ apply_error_callback_arg.committs == 0
+ ? "(unset)"
+ : timestamptz_to_str(apply_error_callback_arg.committs));
+
+ errcontext("%s", buf.data);
+}
+
+static void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.action = 0;
+ apply_error_callback_arg.rel = NULL;
+ apply_error_callback_arg.remote_xid = InvalidTransactionId;
+ apply_error_callback_arg.committs = 0;
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 55b90c03ea..a1bd3e2d9a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -173,5 +173,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern const char *logicalrep_action(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
--
2.24.3 (Apple Git-128)
Attachment: v1-0001-Add-ALTER-SUBSCRIPTION-SET-SKIP-TRANSACTION.patch
From cc69040e063a91acce0a7c7b9b4defe14500bb31 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:18:58 +0900
Subject: [PATCH v1 1/3] Add ALTER SUBSCRIPTION SET SKIP TRANSACTION.
---
src/backend/catalog/pg_subscription.c | 10 ++
src/backend/commands/subscriptioncmds.c | 21 +++
src/backend/parser/gram.y | 17 ++
src/backend/replication/logical/worker.c | 216 ++++++++++++++++++++---
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 5 +-
6 files changed, 243 insertions(+), 30 deletions(-)
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 29fc4218cd..7b79de6351 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -103,6 +103,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index b862e59f1d..97fd56e371 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -429,6 +429,7 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
values[Anum_pg_subscription_subenabled - 1] = BoolGetDatum(enabled);
values[Anum_pg_subscription_subbinary - 1] = BoolGetDatum(binary);
values[Anum_pg_subscription_substream - 1] = BoolGetDatum(streaming);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (slotname)
@@ -1020,6 +1021,26 @@ AlterSubscription(AlterSubscriptionStmt *stmt, bool isTopLevel)
break;
}
+ case ALTER_SUBSCRIPTION_SET_SKIP_XID:
+ {
+ if (sub->skipxid != stmt->skip_xid)
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(stmt->skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ update_tuple = true;
+ }
+
+ break;
+ }
+ case ALTER_SUBSCRIPTION_RESET_SKIP_XID:
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ update_tuple = true;
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index eb24195438..ef9213570f 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9772,6 +9772,23 @@ AlterSubscriptionStmt:
(Node *)makeInteger(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SET SKIP TRANSACTION Iconst
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SET_SKIP_XID;
+ n->subname = $3;
+ n->skip_xid = $7;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET SKIP TRANSACTION
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_SKIP_XID;
+ n->subname = $3;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index bbb659dad0..b90a8df166 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -62,6 +62,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -181,6 +182,19 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * True if we're skipping changes of the specified transaction in
+ * MySubscription->skipxid. Note that we don't skip receiving those
+ * changes; we decide whether or not to skip applying them when starting
+ * to apply. That is, for streamed transactions, we receive the streamed changes
+ * anyway and then clean up the streamed files when applying the stream-commit or
+ * stream-abort message. When stopping the skipping behavior, we reset the skip
+ * XID (subskipxid) in pg_subscription and associate the origin status with the
+ * transaction that resets the skip XID so that we can start streaming from the
+ * next transaction.
+ */
+static bool skipping_changes = false;
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -236,8 +250,7 @@ static void maybe_reread_subscription(void);
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
-static void apply_handle_commit_internal(StringInfo s,
- LogicalRepCommitData *commit_data);
+static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
TupleTableSlot *remoteslot);
@@ -256,6 +269,13 @@ static void apply_handle_tuple_routing(ApplyExecutionData *edata,
TupleTableSlot *remoteslot,
LogicalRepTupleData *newtup,
CmdType operation);
+static void apply_streamed_changes(TransactionId xid,
+ LogicalRepCommitData *commit_data);
+
+static bool start_skipping_changes(TransactionId xid);
+static bool stop_skipping_changes(bool reset_xid,
+ LogicalRepCommitData *commit_data);
+
/*
* Should this worker apply changes for given relation.
@@ -771,7 +791,9 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
- in_remote_transaction = true;
+ /* Start skipping all changes of this transaction if necessary */
+ if (!start_skipping_changes(begin_data.xid))
+ in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
}
@@ -795,7 +817,12 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(s, &commit_data);
+ /*
+ * Stop the skipping transaction if enabled. Otherwise, commit the
+ * changes that are just applied.
+ */
+ if (!stop_skipping_changes(true, &commit_data))
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -813,9 +840,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !skipping_changes &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -837,6 +865,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -849,9 +880,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1016,6 +1044,9 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ /* Stop the skipping transaction if enabled */
+ stop_skipping_changes(true, NULL);
}
/*
@@ -1025,14 +1056,7 @@ static void
apply_handle_stream_commit(StringInfo s)
{
TransactionId xid;
- StringInfoData s2;
- int nchanges;
- char path[MAXPGPATH];
- char *buffer = NULL;
LogicalRepCommitData commit_data;
- StreamXidHash *ent;
- MemoryContext oldcxt;
- BufFile *fd;
if (in_streamed_transaction)
ereport(ERROR,
@@ -1041,8 +1065,40 @@ apply_handle_stream_commit(StringInfo s)
xid = logicalrep_read_stream_commit(s, &commit_data);
+ remote_final_lsn = commit_data.commit_lsn;
+
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Stop skipping transaction information if enabled. Otherwise, apply
+ * all streamed changes and commit the transaction.
+ */
+ if (!stop_skipping_changes(true, &commit_data))
+ apply_streamed_changes(xid, &commit_data);
+
+ /* unlink the files with serialized changes and subxact info */
+ stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+
+ pgstat_report_activity(STATE_IDLE, NULL);
+}
+
+/*
+ * Apply all streamed changes with the xid.
+ */
+static void
+apply_streamed_changes(TransactionId xid, LogicalRepCommitData *commit_data)
+{
+ StringInfoData s2;
+ int nchanges;
+ char path[MAXPGPATH];
+ char *buffer = NULL;
+ StreamXidHash *ent;
+ MemoryContext oldcxt;
+ BufFile *fd;
+
/* Make sure we have an open transaction */
begin_replication_step();
@@ -1074,8 +1130,6 @@ apply_handle_stream_commit(StringInfo s)
MemoryContextSwitchTo(oldcxt);
- remote_final_lsn = commit_data.commit_lsn;
-
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1153,22 +1207,15 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
nchanges, path);
- apply_handle_commit_internal(s, &commit_data);
-
- /* unlink the files with serialized changes and subxact info */
- stream_cleanup_files(MyLogicalRepWorker->subid, xid);
-
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
-
- pgstat_report_activity(STATE_IDLE, NULL);
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(commit_data);
}
/*
* Helper function for apply_handle_commit and apply_handle_stream_commit.
*/
static void
-apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
+apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
if (IsTransactionState())
{
@@ -2020,6 +2067,17 @@ apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (skipping_changes &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
@@ -2309,7 +2367,8 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
/* confirm all writes so far */
send_feedback(last_received, false, false);
- if (!in_remote_transaction && !in_streamed_transaction)
+ if (!in_remote_transaction && !in_streamed_transaction &&
+ !skipping_changes)
{
/*
* If we didn't get any transactions for a while there might be
@@ -3254,3 +3313,102 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+/*
+ * Start skipping changes of the given transaction. Return true if we
+ * enabled the skipping behavior.
+ */
+static bool
+start_skipping_changes(TransactionId xid)
+{
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return false;
+
+ skipping_changes = true;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ xid));
+
+ return true;
+}
+
+/*
+ * Stop skipping changes and reset the skip XID. Return true if we were
+ * skipping changes and have stopped it.
+ *
+ * reset_xid is true if the caller wants to reset the skip XID (subskipxid)
+ * after disabling the skipping behavior. Also, if *commit_data is non-NULL,
+ * we set origin state to the transaction commit that resets the skip XID so
+ * that we can start streaming from the transaction next to the one that we
+ * just skipped.
+ */
+static bool
+stop_skipping_changes(bool reset_xid, LogicalRepCommitData *commit_data)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ if (!skipping_changes)
+ return false;
+
+ Assert(TransactionIdIsValid(MySubscription->skipxid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /* Stop skipping changes */
+ skipping_changes = false;
+
+ ereport(LOG,
+ errmsg("done skipping logical replication transaction with xid %u",
+ MySubscription->skipxid));
+
+ if (!reset_xid)
+ return true;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* Update the system catalog to reset the skip XID */
+ StartTransactionCommand();
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (commit_data)
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = commit_data->end_lsn;
+ replorigin_session_origin_timestamp = commit_data->committime;
+ }
+
+ CommitTransactionCommand();
+
+ return true;
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 0060ebfb40..a13b936891 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -57,6 +57,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
bool substream; /* Stream in-progress transactions. */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -94,6 +97,7 @@ typedef struct Subscription
bool binary; /* Indicates if the subscription wants data in
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index def9651b34..49e5d3e1e5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3664,7 +3664,9 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SET_SKIP_XID,
+ ALTER_SUBSCRIPTION_RESET_SKIP_XID,
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
@@ -3675,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
--
2.24.3 (Apple Git-128)
On Mon, Jun 28, 2021 at 10:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jun 17, 2021 at 6:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jun 17, 2021 at 3:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Now, if this function is used by superusers then we can probably trust
that they provide XIDs that are safe to skip, but OTOH restricting these
functions to superusers might limit the usage of this repair tool.

If we specify the subscription ID or name, maybe we can also allow the
owner of the subscription to do that operation?

Ah, the owner of the subscription must be a superuser.
I've attached PoC patches.
0001 patch introduces the ability to skip transactions on the
subscriber side. We can specify the XID for a subscription with a command
like ALTER SUBSCRIPTION test_sub SET SKIP TRANSACTION 100. The
implementation seems straightforward except for setting the origin state.
After skipping the transaction we have to update the session origin state
so that we can start streaming from the transaction next to the one that
we just skipped in case the server crashes or the apply worker restarts. We
set origin state to the commit WAL record. However, since we skip all
changes we don’t write any WAL even if we call CommitTransaction() at
the end of the skipped transaction. So the patch sets the origin state
to the transaction that updates the pg_subscription system catalog to
reset the skip XID. I think we need a discussion of this part.
IIUC, for streaming transactions you are allowing stream file to be
created and then remove it at stream_commit/stream_abort time, is that
right? If so, in which cases are you imagining the files to be
created, is it in the case of relation message
(LOGICAL_REP_MSG_RELATION)? Assuming the previous two statements are
correct, this will skip the relation message as well as part of the
removal of stream files which might lead to a problem because the
publisher won't know that we have skipped the relation message and it
won't send it again. This can cause problems while processing the next
messages.
With the 0002 and 0003 patches, we report the error information in the
server logs and the stats view, respectively. The 0002 patch adds
errcontext to messages raised while applying the changes:

ERROR: duplicate key value violates unique constraint "hoge_pkey"
DETAIL: Key (c)=(1) already exists.
CONTEXT: during apply of "INSERT" for relation "public.hoge" in
transaction with xid 736 committs 2021-06-27 12:12:30.053887+09

The 0003 patch adds the pg_stat_logical_replication_error statistics view
discussed on another thread[1]. The apply worker sends the error
information to the stats collector if an error happens while applying
changes. We can check those errors as follows:

postgres(1:25250)=# select * from pg_stat_logical_replication_error;
 subname  | relid | action | xid |         last_failure
----------+-------+--------+-----+-------------------------------
 test_sub | 16384 | INSERT | 736 | 2021-06-27 12:12:45.142675+09
(1 row)

I added only the columns required for the skipping transaction feature to
the view for now.
Isn't it better to add an error message if possible?
Please note that those patches are meant to evaluate the concept we've
discussed so far. They don't include the doc updates yet.
I think your patch is on the lines of what we have discussed. It would
be good if you can update the docs and add a few tests.
--
With Regards,
Amit Kapila.
On Wed, Jun 30, 2021 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jun 28, 2021 at 10:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
0003 patch adds pg_stat_logical_replication_error statistics view
discussed on another thread[1]. The apply worker sends the error
information to the stats collector if an error happens during applying
changes. We can check those errors as follows:

postgres(1:25250)=# select * from pg_stat_logical_replication_error;
 subname  | relid | action | xid |         last_failure
----------+-------+--------+-----+-------------------------------
 test_sub | 16384 | INSERT | 736 | 2021-06-27 12:12:45.142675+09
(1 row)

I added only columns required for the skipping transaction feature to
the view for now.
Isn't it better to add an error message if possible?
Don't we want to clear stats at drop subscription as well? We do drop
database stats in dropdb via pgstat_drop_database, so I think we need
to clear subscription stats at the time of drop subscription.
In the 0003 patch, if I am reading it correctly then the patch is not
doing anything for the tablesync worker. It is not clear to me at this
stage what exactly we want to do about it. Do we want to just ignore
errors from the tablesync worker and let the system behave as it is
without this feature? If we want to do anything then I think the way
to skip the initial table sync would be to behave like the user has
given the 'copy_data' option as false.
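For reference, skipping the initial copy that way corresponds to the existing copy_data option (this is real CREATE/ALTER SUBSCRIPTION syntax; the subscription, connection, and publication names are illustrative):

```sql
-- Create a subscription that starts streaming changes without
-- copying the tables' pre-existing data:
CREATE SUBSCRIPTION test_sub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION test_pub
    WITH (copy_data = false);

-- The same option applies when picking up newly published tables later:
ALTER SUBSCRIPTION test_sub REFRESH PUBLICATION WITH (copy_data = false);
```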
--
With Regards,
Amit Kapila.
On Wed, Jun 30, 2021 at 8:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jun 28, 2021 at 10:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jun 17, 2021 at 6:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jun 17, 2021 at 3:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Now, if this function is used by superusers then we can probably trust
that they provide XIDs that are safe to skip, but OTOH restricting these
functions to superusers might restrict the usage of this
repair tool.
If we specify the subscription id or name, maybe we can also allow the
owner of the subscription to do that operation?
Ah, the owner of a subscription must be a superuser.
I've attached PoC patches.
0001 patch introduces the ability to skip transactions on the
subscriber side. We can specify the XID for the subscription like ALTER
SUBSCRIPTION test_sub SET SKIP TRANSACTION 100. The implementation
seems straightforward except for setting origin state. After skipping
the transaction we have to update the session origin state so that we
can start streaming the transaction next to the one that we just
skipped in case of the server crash or restarting the apply worker. We
set origin state to the commit WAL record. However, since we skip all
changes we don’t write any WAL even if we call CommitTransaction() at
the end of the skipped transaction. So the patch sets the origin state
to the transaction that updates the pg_subscription system catalog to
reset the skip XID. I think we need a discussion of this part.
IIUC, for streaming transactions you are allowing the stream file to be
created and then removed at stream_commit/stream_abort time, is that
right?
Right.
If so, in which cases are you imagining the files to be
created; is it in the case of the relation message
(LOGICAL_REP_MSG_RELATION)? Assuming the previous two statements are
correct, this will skip the relation message as well, as part of the
removal of stream files, which might lead to a problem because the
publisher won't know that we have skipped the relation message and it
won't send it again. This can cause problems while processing the next
messages.
Good point. In the current patch, we skip all streamed changes at
stream_commit/abort, but we should instead apply non-data messages and
skip only the data-modification changes, as we do for non-streamed
transactions.
With 0002 and 0003 patches, we report the error information in server
logs and the stats view, respectively. 0002 patch adds errcontext for
messages that happened during applying the changes:

ERROR: duplicate key value violates unique constraint "hoge_pkey"
DETAIL: Key (c)=(1) already exists.
CONTEXT: during apply of "INSERT" for relation "public.hoge" in
transaction with xid 736 committs 2021-06-27 12:12:30.053887+09

0003 patch adds pg_stat_logical_replication_error statistics view
discussed on another thread[1]. The apply worker sends the error
information to the stats collector if an error happens during applying
changes. We can check those errors as follows:

postgres(1:25250)=# select * from pg_stat_logical_replication_error;
 subname  | relid | action | xid |         last_failure
----------+-------+--------+-----+-------------------------------
 test_sub | 16384 | INSERT | 736 | 2021-06-27 12:12:45.142675+09
(1 row)

I added only columns required for the skipping transaction feature to
the view for now.
Isn't it better to add an error message if possible?
Please note that those patches are meant to evaluate the concept we've
discussed so far. They don't include the doc updates yet.
I think your patch is on the lines of what we have discussed. It would
be good if you can update the docs and add a few tests.
Okay. I'll incorporate the above suggestions in the next version of the patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jul 1, 2021 at 1:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jun 30, 2021 at 8:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
If so, in which cases are you imagining the files to be
created; is it in the case of the relation message
(LOGICAL_REP_MSG_RELATION)? Assuming the previous two statements are
correct, this will skip the relation message as well, as part of the
removal of stream files, which might lead to a problem because the
publisher won't know that we have skipped the relation message and it
won't send it again. This can cause problems while processing the next
messages.
Good point. In the current patch, we skip all streamed changes at
stream_commit/abort, but we should instead apply non-data messages and
skip only the data-modification changes, as we do for non-streamed
transactions.
Right.
--
With Regards,
Amit Kapila.
On Thu, Jul 1, 2021 at 12:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jun 30, 2021 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jun 28, 2021 at 10:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
0003 patch adds pg_stat_logical_replication_error statistics view
discussed on another thread[1]. The apply worker sends the error
information to the stats collector if an error happens during applying
changes. We can check those errors as follows:

postgres(1:25250)=# select * from pg_stat_logical_replication_error;
 subname  | relid | action | xid |         last_failure
----------+-------+--------+-----+-------------------------------
 test_sub | 16384 | INSERT | 736 | 2021-06-27 12:12:45.142675+09
(1 row)

I added only columns required for the skipping transaction feature to
the view for now.
Isn't it better to add an error message if possible?
Don't we want to clear stats at drop subscription as well? We do drop
database stats in dropdb via pgstat_drop_database, so I think we need
to clear subscription stats at the time of drop subscription.
Yes, it needs to be cleared. In the 0003 patch, pgstat_vacuum_stat()
sends the message to clear the stats. I think it's better to have
pgstat_vacuum_stat() do that job, similar to dropping replication slot
statistics, rather than relying on a single message sent at DROP
SUBSCRIPTION. I've considered doing both: sending the message at DROP
SUBSCRIPTION and periodic checking by pgstat_vacuum_stat(). But a
DROP SUBSCRIPTION that doesn't drop a replication slot can be rolled
back, so we would need to send the message only at commit time. Given
that we don't necessarily need the stats to be updated immediately, I
think it's reasonable to rely only on pgstat_vacuum_stat().
In the 0003 patch, if I am reading it correctly then the patch is not
doing anything for the tablesync worker. It is not clear to me at this
stage what exactly we want to do about it. Do we want to just ignore
errors from the tablesync worker and let the system behave as it is
without this feature? If we want to do anything then I think the way
to skip the initial table sync would be to behave like the user has
given the 'copy_data' option as false.
It might be better to have sync workers also report errors, even if
the SKIP TRANSACTION feature doesn't support anything for initial table
synchronization. From the user's perspective, the initial table
synchronization is also part of logical replication operations. If
we report only error information from applying logical changes, it could
confuse users.
But I'm not sure about the way to skip the initial table
synchronization. Once we set 'copy_data' to false, all table
synchronizations are disabled. Some of them might have been able to
synchronize successfully. It might be useful if the user can disable
the table initialization for particular tables.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jul 1, 2021 at 6:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jul 1, 2021 at 12:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Don't we want to clear stats at drop subscription as well? We do drop
database stats in dropdb via pgstat_drop_database, so I think we need
to clear subscription stats at the time of drop subscription.
Yes, it needs to be cleared. In the 0003 patch, pgstat_vacuum_stat()
sends the message to clear the stats. I think it's better to have
pgstat_vacuum_stat() do that job, similar to dropping replication slot
statistics, rather than relying on a single message sent at DROP
SUBSCRIPTION. I've considered doing both: sending the message at DROP
SUBSCRIPTION and periodic checking by pgstat_vacuum_stat(). But a
DROP SUBSCRIPTION that doesn't drop a replication slot can be rolled
back, so we would need to send the message only at commit time. Given
that we don't necessarily need the stats to be updated immediately, I
think it's reasonable to rely only on pgstat_vacuum_stat().
Okay, that makes sense. Can we consider sending multiple ids in
one message as we do for relations or functions in
pgstat_vacuum_stat()? That will reduce some message traffic. BTW, do
we have some way to avoid wrapping around the OID before we clean up
via pgstat_vacuum_stat()?
In the 0003 patch, if I am reading it correctly then the patch is not
doing anything for the tablesync worker. It is not clear to me at this
stage what exactly we want to do about it. Do we want to just ignore
errors from the tablesync worker and let the system behave as it is
without this feature? If we want to do anything then I think the way
to skip the initial table sync would be to behave like the user has
given the 'copy_data' option as false.
It might be better to have sync workers also report errors, even if
the SKIP TRANSACTION feature doesn't support anything for initial table
synchronization. From the user's perspective, the initial table
synchronization is also part of logical replication operations. If
we report only error information from applying logical changes, it could
confuse users.
But I'm not sure about the way to skip the initial table
synchronization. Once we set 'copy_data' to false, all table
synchronizations are disabled. Some of them might have been able to
synchronize successfully. It might be useful if the user can disable
the table initialization for particular tables.
True, but I guess the user can wait for all the tablesyncs to either
finish or get an error corresponding to the table sync. After that, they
can use 'copy_data' as false. This is not a very good method but I
don't see any other option. Whatever the case, logging errors from
tablesyncs is not a bad idea anyway.
Instead of using the syntax "ALTER SUBSCRIPTION name SET SKIP
TRANSACTION Iconst", isn't it better to use it as a subscription
option like Mark has done for his patch (disable_on_error)?
I am slightly nervous about this way of allowing the user to skip
errors because, if it is not used carefully, it can easily lead to
inconsistent data on the subscriber. I agree that as only superusers
will be allowed to use this option and we can document the side-effects
clearly, the risk could be reduced, but is that sufficient? It is not
that we don't have any other tools which allow users to make their
data inconsistent if not used carefully (one recent example is the
functions heap_force_kill/heap_force_freeze in the pg_surgery module),
but it might be better not to expose such tools.
OTOH, if we use the error infrastructure of this patch and allow users
to just disable the subscription on error as was proposed by Mark, then
that can't lead to any inconsistency.
What do you think?
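For comparison, the disable_on_error alternative under discussion could look roughly like this sketch. The disable_on_error option is from Mark's proposed (uncommitted) patch, so its syntax is speculative; the re-enable step uses existing ALTER SUBSCRIPTION syntax:

```sql
-- Hypothetical: on an apply error, the worker disables the
-- subscription instead of retrying the failing transaction forever.
CREATE SUBSCRIPTION test_sub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION test_pub
    WITH (disable_on_error = true);

-- After fixing the problem on the subscriber, the user resumes apply
-- explicitly (existing syntax):
ALTER SUBSCRIPTION test_sub ENABLE;
```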
--
With Regards,
Amit Kapila.
On Mon, Jul 5, 2021 at 3:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 1, 2021 at 6:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jul 1, 2021 at 12:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Instead of using the syntax "ALTER SUBSCRIPTION name SET SKIP
TRANSACTION Iconst", isn't it better to use it as a subscription
option like Mark has done for his patch (disable_on_error)?
I am slightly nervous about this way of allowing the user to skip
errors because, if it is not used carefully, it can easily lead to
inconsistent data on the subscriber. I agree that as only superusers
will be allowed to use this option and we can document the side-effects
clearly, the risk could be reduced, but is that sufficient?
I see that users can create a similar effect by using
pg_replication_origin_advance() and it is mentioned in the docs that
careless use of this function can lead to inconsistently replicated
data. So, this new way doesn't seem to be any more dangerous than what
we already have.
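For reference, that existing (documented) way of skipping data on the subscriber looks roughly like the following; the origin name and LSN here are illustrative, and a carelessly chosen LSN can skip more than the intended transaction:

```sql
-- Find the subscription's replication origin and its current position.
SELECT s.external_id, s.remote_lsn
FROM pg_replication_origin_status s
JOIN pg_replication_origin o ON o.roident = s.local_id;

-- Advance the origin past the problem transaction's commit LSN
-- (the apply worker must not be running, e.g. disable the subscription
-- first). Subscription origins are named 'pg_<subscription oid>'.
SELECT pg_replication_origin_advance('pg_16389', '0/1D0A2F8'::pg_lsn);
```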
--
With Regards,
Amit Kapila.
Hi,
I have a few notes about pg_stat_logical_replication_error from the
point of view of a DBA (who will use this view in the future).
1. As I understand it, this view might contain many errors related to
different subscriptions. It is better to name it
"pg_stat_logical_replication_errors" using the plural form (as is done
for the stat views for tables, indexes, and functions). Also, I'd like
to suggest thinking twice about the view name (and the function used in
the view's DDL): "pg_stat_logical_replication_error" contains the very
common "logical replication" words, but the view contains errors
related to subscriptions only. In the future there could be other kinds
of errors related to logical replication but not related to
subscriptions; what will you do then?
2. Add a field with the database name or id; it helps to quickly
understand which database the subscription belongs to.
3. Add a counter field with the total number of errors; it helps to
calculate error rates and aggregations (sums), and not lose information
about errors between view checks.
4. Add the text of the last error (if it will not be too expensive).
5. Rename the "action" field to "command"; as far as I know this is
more correct terminology.
Finally, the view might look like this:

postgres(1:25250)=# select * from pg_stat_logical_replication_errors;
 subname | datid | relid | command | xid | total |         last_failure          | last_failure_text
---------+-------+-------+---------+-----+-------+-------------------------------+---------------------------
 sub_1   | 12345 | 16384 | INSERT  | 736 |   145 | 2021-06-27 12:12:45.142675+09 | something goes wrong...
 sub_2   | 12346 | 16458 | UPDATE  | 845 |    12 | 2021-06-27 12:16:01.458752+09 | hmm, something goes wrong
Regards, Alexey
On Mon, Jul 5, 2021 at 2:59 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jun 17, 2021 at 6:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jun 17, 2021 at 3:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Now, if this function is used by superusers then we can probably trust
that they provide XIDs that are safe to skip, but OTOH restricting these
functions to superusers might restrict the usage of this
repair tool.
If we specify the subscription id or name, maybe we can also allow the
owner of the subscription to do that operation?
Ah, the owner of a subscription must be a superuser.
I've attached PoC patches.
0001 patch introduces the ability to skip transactions on the
subscriber side. We can specify the XID for the subscription like ALTER
SUBSCRIPTION test_sub SET SKIP TRANSACTION 100. The implementation
seems straightforward except for setting origin state. After skipping
the transaction we have to update the session origin state so that we
can start streaming the transaction next to the one that we just
skipped in case of the server crash or restarting the apply worker. We
set origin state to the commit WAL record. However, since we skip all
changes we don’t write any WAL even if we call CommitTransaction() at
the end of the skipped transaction. So the patch sets the origin state
to the transaction that updates the pg_subscription system catalog to
reset the skip XID. I think we need a discussion of this part.
With 0002 and 0003 patches, we report the error information in server
logs and the stats view, respectively. 0002 patch adds errcontext for
messages that happened during applying the changes:

ERROR: duplicate key value violates unique constraint "hoge_pkey"
DETAIL: Key (c)=(1) already exists.
CONTEXT: during apply of "INSERT" for relation "public.hoge" in
transaction with xid 736 committs 2021-06-27 12:12:30.053887+09

0003 patch adds pg_stat_logical_replication_error statistics view
discussed on another thread[1]. The apply worker sends the error
information to the stats collector if an error happens during applying
changes. We can check those errors as follows:

postgres(1:25250)=# select * from pg_stat_logical_replication_error;
 subname  | relid | action | xid |         last_failure
----------+-------+--------+-----+-------------------------------
 test_sub | 16384 | INSERT | 736 | 2021-06-27 12:12:45.142675+09
(1 row)

I added only columns required for the skipping transaction feature to
the view for now.
Please note that those patches are meant to evaluate the concept we've
discussed so far. They don't include the doc updates yet.
Regards,
[1]
/messages/by-id/DB35438F-9356-4841-89A0-412709EBD3AB@enterprisedb.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
--
Best regards, Alexey V. Lesovsky
On Mon, Jul 5, 2021 at 7:33 PM Alexey Lesovsky <lesovsky@gmail.com> wrote:
Hi,
I have a few notes about pg_stat_logical_replication_error from the point of view of a DBA (who will use this view in the future).
Thank you for the comments!
1. As I understand it, this view might contain many errors related to different subscriptions. It is better to name it "pg_stat_logical_replication_errors" using the plural form (as is done for the stat views for tables, indexes, and functions).
Agreed.
Also, I'd like to suggest thinking twice about the view name (and the function used in the view's DDL): "pg_stat_logical_replication_error" contains the very common "logical replication" words, but the view contains errors related to subscriptions only. In the future there could be other kinds of errors related to logical replication but not related to subscriptions; what will you do then?
Is pg_stat_subscription_errors or
pg_stat_logical_replication_apply_errors better?
2. Add a field with database name or id - it helps to quickly understand to which database the subscription belongs.
Agreed.
3. Add a counter field with total number of errors - it helps to calculate errors rates and aggregations (sum), and don't lose information about errors between view checks.
Do you mean to increment the error count if the error (command, xid,
and relid) is the same as the previous one, or to have the total
number of errors per subscription? And what can we infer from the
error rates and aggregations?
4. Add text of last error (if it will not be too expensive).
Agreed.
5. Rename the "action" field to "command"; as far as I know this is more correct terminology.
Okay.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Jul 5, 2021 at 6:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 1, 2021 at 6:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jul 1, 2021 at 12:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Don't we want to clear stats at drop subscription as well? We do drop
database stats in dropdb via pgstat_drop_database, so I think we need
to clear subscription stats at the time of drop subscription.
Yes, it needs to be cleared. In the 0003 patch, pgstat_vacuum_stat()
sends the message to clear the stats. I think it's better to have
pgstat_vacuum_stat() do that job, similar to dropping replication slot
statistics, rather than relying on a single message sent at DROP
SUBSCRIPTION. I've considered doing both: sending the message at DROP
SUBSCRIPTION and periodic checking by pgstat_vacuum_stat(). But a
DROP SUBSCRIPTION that doesn't drop a replication slot can be rolled
back, so we would need to send the message only at commit time. Given
that we don't necessarily need the stats to be updated immediately, I
think it's reasonable to rely only on pgstat_vacuum_stat().
Okay, that makes sense. Can we consider sending multiple ids in
one message as we do for relations or functions in
pgstat_vacuum_stat()? That will reduce some message traffic.
Yes. Since subscriptions are objects that are not frequently created
and dropped, I prioritized not increasing the number of message types.
But if we do that for subscriptions, is it better to do it for
replication slots as well? It seems to me that the lifetimes of
subscriptions and replication slots are similar.
BTW, do
we have some way to avoid wrapping around the OID before we clean up
via pgstat_vacuum_stat()?
As far as I know there is not.
In the 0003 patch, if I am reading it correctly then the patch is not
doing anything for the tablesync worker. It is not clear to me at this
stage what exactly we want to do about it. Do we want to just ignore
errors from the tablesync worker and let the system behave as it is
without this feature? If we want to do anything then I think the way
to skip the initial table sync would be to behave like the user has
given the 'copy_data' option as false.
It might be better to have sync workers also report errors, even if
the SKIP TRANSACTION feature doesn't support anything for initial table
synchronization. From the user's perspective, the initial table
synchronization is also part of logical replication operations. If
we report only error information from applying logical changes, it could
confuse users.
But I'm not sure about the way to skip the initial table
synchronization. Once we set 'copy_data' to false, all table
synchronizations are disabled. Some of them might have been able to
synchronize successfully. It might be useful if the user can disable
the table initialization for particular tables.
True, but I guess the user can wait for all the tablesyncs to either
finish or get an error corresponding to the table sync. After that, they
can use 'copy_data' as false. This is not a very good method but I
don't see any other option. Whatever the case, logging errors from
tablesyncs is not a bad idea anyway.
Instead of using the syntax "ALTER SUBSCRIPTION name SET SKIP
TRANSACTION Iconst", isn't it better to use it as a subscription
option like Mark has done for his patch (disable_on_error)?
According to the docs, ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
specify a subset of the parameters that can be specified by CREATE
SUBSCRIPTION. That makes sense to me for 'disable_on_error' since it can
be specified by CREATE SUBSCRIPTION, whereas the SKIP TRANSACTION stuff
cannot. Are you concerned about adding new syntax to ALTER
SUBSCRIPTION?
I am slightly nervous about this way of allowing the user to skip
errors because, if it is not used carefully, it can easily lead to
inconsistent data on the subscriber. I agree that as only superusers
will be allowed to use this option and we can document the side-effects
clearly, the risk could be reduced, but is that sufficient? It is not
that we don't have any other tools which allow users to make their
data inconsistent if not used carefully (one recent example is the
functions heap_force_kill/heap_force_freeze in the pg_surgery module),
but it might be better not to expose such tools.
OTOH, if we use the error infrastructure of this patch and allow users
to just disable the subscription on error as was proposed by Mark, then
that can't lead to any inconsistency.
What do you think?
As you mentioned in another mail, what we can do with this feature is
the same as pg_replication_origin_advance(). Just as there is a risk
that the user specifies a wrong LSN to pg_replication_origin_advance(),
there is a similar risk with this feature.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jul 6, 2021 at 11:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 5, 2021 at 7:33 PM Alexey Lesovsky <lesovsky@gmail.com> wrote:
Hi,
I have a few notes about pg_stat_logical_replication_error from the point of view of a DBA (who will use this view in the future).
Thank you for the comments!
1. As I understand it, this view might contain many errors related to different subscriptions. It is better to name it "pg_stat_logical_replication_errors" using the plural form (as is done for the stat views for tables, indexes, and functions).
Agreed.
Also, I'd like to suggest thinking twice about the view name (and the function used in the view's DDL): "pg_stat_logical_replication_error" contains the very common "logical replication" words, but the view contains errors related to subscriptions only. In the future there could be other kinds of errors related to logical replication but not related to subscriptions; what will you do then?
Is pg_stat_subscription_errors or
pg_stat_logical_replication_apply_errors better?
Few more to consider: pg_stat_apply_failures,
pg_stat_subscription_failures, pg_stat_apply_conflicts,
pg_stat_subscription_conflicts.
2. Add a field with database name or id - it helps to quickly understand to which database the subscription belongs.
Agreed.
3. Add a counter field with total number of errors - it helps to calculate errors rates and aggregations (sum), and don't lose information about errors between view checks.
Do you mean to increment the error count if the error (command, xid,
and relid) is the same as the previous one, or to have the total
number of errors per subscription?
I would prefer the total number of errors per subscription.
And what can we infer from the
error rates and aggregations?
Say, if we add a column like failure_type/conflict_type as well, one
might be interested in knowing how many conflicts are due to
primary key conflicts vs. update/delete conflicts.
You might want to consider keeping this view patch before the skip_xid
patch in your patch series, as this will be the base for the skip_xid
patch.
--
With Regards,
Amit Kapila.
On Tue, Jul 6, 2021 at 12:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 5, 2021 at 6:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 1, 2021 at 6:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jul 1, 2021 at 12:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Don't we want to clear stats at drop subscription as well? We do drop
database stats in dropdb via pgstat_drop_database, so I think we need
to clear subscription stats at the time of drop subscription.
Yes, it needs to be cleared. In the 0003 patch, pgstat_vacuum_stat()
sends the message to clear the stats. I think it's better to have
pgstat_vacuum_stat() do that job, similar to dropping replication slot
statistics, rather than relying on a single message sent at DROP
SUBSCRIPTION. I've considered doing both: sending the message at DROP
SUBSCRIPTION and periodic checking by pgstat_vacuum_stat(). But a
DROP SUBSCRIPTION that doesn't drop a replication slot can be rolled
back, so we would need to send the message only at commit time. Given
that we don't necessarily need the stats to be updated immediately, I
think it's reasonable to rely only on pgstat_vacuum_stat().
Okay, that makes sense. Can we consider sending multiple ids in
one message as we do for relations or functions in
pgstat_vacuum_stat()? That will reduce some message traffic.
Yes. Since subscriptions are objects that are not frequently created
and dropped, I prioritized not increasing the number of message types.
But if we do that for subscriptions, is it better to do it for
replication slots as well? It seems to me that the lifetimes of
subscriptions and replication slots are similar.
Yeah, I think it makes sense to do it for both; we can work on the slots
patch separately. I don't see a reason why we shouldn't send a single
message for multiple clear/drop entries.
True, but I guess the user can wait for all the tablesyncs to either
finish or get an error corresponding to the table sync. After that, they
can use 'copy_data' as false. This is not a very good method but I
don't see any other option. Whatever the case, logging errors from
tablesyncs is not a bad idea anyway.
Instead of using the syntax "ALTER SUBSCRIPTION name SET SKIP
TRANSACTION Iconst", isn't it better to use it as a subscription
option like Mark has done for his patch (disable_on_error)?
According to the docs, ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
specify a subset of the parameters that can be specified by CREATE
SUBSCRIPTION. That makes sense to me for 'disable_on_error' since it can
be specified by CREATE SUBSCRIPTION, whereas the SKIP TRANSACTION stuff
cannot. Are you concerned about adding new syntax to ALTER
SUBSCRIPTION?
Both: the additional syntax and consistency with disable_on_error.
Isn't it just the current implementation that Alter only allows
changing parameters supported by Create? Is there a reason why we can't
allow Alter to set/change some parameters not supported by Create?
--
With Regards,
Amit Kapila.
On Tue, Jul 6, 2021 at 10:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Also, I'd like to suggest thinking twice about the view name (and
function used in view DDL) - "pg_stat_logical_replication_error" contains
very common "logical replication" words, but the view contains errors
related to subscriptions only. In the future there could be other kinds of
errors related to logical replication, but not related to subscriptions -
what will you do?
Is pg_stat_subscription_errors or
pg_stat_logical_replication_apply_errors better?
It seems to me 'pg_stat_subscription_conflicts' proposed by Amit Kapila is
the most suitable, because it speaks directly about conflicts occurring on
the subscription side. The name 'pg_stat_subscription_errors' is also good,
especially in case of further extension, if similar kinds of errors will
be tracked.
3. Add a counter field with total number of errors - it helps to
calculate errors rates and aggregations (sum), and don't lose information
about errors between view checks.

Do you mean to increment the error count if the error (command, xid,
and relid) is the same as the previous one? or to have the total
number of errors per subscription? And what can we infer from the
error rates and aggregations?
To be honest, I hurried up when I wrote the first email, and read only
about stats view. Later, I read the starting email about the patch and
rethought this note.
As I understand, when the conflict occurs, replication stops (until
conflict is resolved), an error appears in the stats view. Now, no new
errors can occur in the blocked subscription. Hence, it is impossible
for many errors (like spikes) to have occurred without the user seeing
them. If I am correct in my assumption, there is no need for counters.
They are necessary only when errors might occur too frequently (like
pg_stat_database.deadlocks). But if this is possible, I would prefer the
total number of errors per subscription, as also proposed by Amit.
Under "error rates and aggregations" I also mean in the context of when a
high number of errors occurred in a short period of time. If a user can
read the "total errors" counter and keep this metric in his monitoring
system, he will be able to calculate rates over time using functions in the
monitoring system. This is extremely useful.
I also would like to clarify, when conflict is resolved - the error record
is cleared or kept in the view? If it is cleared, the error counter is
required (because we don't want to lose all history of errors). If it is
kept - the flag telling about the error is resolved is needed (or set xid
to NULL). I mean when the user is watching the view, he should be able to
identify if the error has already been resolved or not.
--
Regards, Alexey
On Tue, Jul 6, 2021 at 10:58 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
On Mon, Jul 5, 2021 at 7:33 PM Alexey Lesovsky <lesovsky@gmail.com> wrote:
Hi,
Have a few notes about pg_stat_logical_replication_error from the DBA point of view (which will use this view in the future).
Thank you for the comments!
1. As I understand it, this view might contain many errors related to
different subscriptions. It is better to name
"pg_stat_logical_replication_errors" using the plural form (like this done
for stat views for tables, indexes, functions).

Agreed.
Also, I'd like to suggest thinking twice about the view name (and
function used in view DDL) - "pg_stat_logical_replication_error" contains
very common "logical replication" words, but the view contains errors
related to subscriptions only. In the future there could be other kinds of
errors related to logical replication, but not related to subscriptions -
what will you do?

Is pg_stat_subscription_errors or
pg_stat_logical_replication_apply_errors better?

2. Add a field with database name or id - it helps to quickly understand
to which database the subscription belongs.
Agreed.
3. Add a counter field with total number of errors - it helps to
calculate errors rates and aggregations (sum), and don't lose information
about errors between view checks.

Do you mean to increment the error count if the error (command, xid,
and relid) is the same as the previous one? or to have the total
number of errors per subscription? And what can we infer from the
error rates and aggregations?

4. Add text of last error (if it will not be too expensive).
Agreed.
5. Rename the "action" field to "command"; as far as I know this is
correct from a terminology point of view.
Okay.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
--
Best regards, Alexey V. Lesovsky
On Tue, Jul 6, 2021 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jul 6, 2021 at 12:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 5, 2021 at 6:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 1, 2021 at 6:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jul 1, 2021 at 12:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Don't we want to clear stats at drop subscription as well? We do drop
database stats in dropdb via pgstat_drop_database, so I think we need
to clear subscription stats at the time of drop subscription.

Yes, it needs to be cleared. In the 0003 patch, pgstat_vacuum_stat()
sends the message to clear the stats. I think it's better to have
pgstat_vacuum_stat() do that job, similar to dropping replication slot
statistics, rather than relying on a single message sent at DROP
SUBSCRIPTION. I've considered doing both: sending the message at DROP
SUBSCRIPTION and periodic checking by pgstat_vacuum_stat(), but
dropping a subscription that doesn't have an associated replication
slot can be rolled back, so we would need to send the message only at
commit time. Given that we don’t necessarily need the stats to be
updated immediately, I think it’s reasonable to rely only on
pgstat_vacuum_stat().

Okay, that makes sense. Can we consider sending the multiple ids in
one message as we do for relations or functions in
pgstat_vacuum_stat()? That will reduce some message traffic.

Yes. Since subscriptions are objects that are not frequently created
and dropped, I prioritized not increasing the number of message types.
But if we
do that for subscriptions, is it better to do that for replication
slots as well? It seems to me that the lifetimes of subscriptions and
replication slots are similar.

Yeah, I think it makes sense to do it for both; we can work on the slots
patch separately. I don't see a reason why we shouldn't send a single
message for multiple clear/drop entries.
+1
True but I guess the user can wait for all the tablesyncs to either
finish or get an error corresponding to the table sync. After that, it
can use 'copy_data' as false. This is not a very good method but I
don't see any other option. I guess whatever is the case logging
errors from tablesyncs is anyway not a bad idea.

Instead of using the syntax "ALTER SUBSCRIPTION name SET SKIP
TRANSACTION Iconst", isn't it better to use it as a subscription
option like Mark has done for his patch (disable_on_error)?

According to the doc, ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
specify a subset of parameters that can be specified by CREATE
SUBSCRIPTION. It makes sense to me for 'disable_on_error' since it can
be specified by CREATE SUBSCRIPTION. Whereas SKIP TRANSACTION stuff
cannot be done. Are you concerned about adding a syntax to ALTER
SUBSCRIPTION?

Both for additional syntax and consistency with disable_on_error.
Isn't it just a current implementation that Alter only allows to
change parameters supported by Create? Is there a reason why we can't
allow Alter to set/change some parameters not supported by Create?
I think there is no reason for that, but looking at ALTER TABLE I
thought there is such a policy. I thought the skipping transaction
feature is somewhat different from the disable_on_error feature. The
former seems a feature to deal with a problem on the spot whereas the
latter seems a setting of a subscription. Anyway, if we use the
subscription option, we can reset the XID by setting 0? Or do we need
ALTER SUBSCRIPTION RESET?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
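For readers following the syntax debate, the two shapes being compared might look like this. This is a hypothetical sketch only: neither form is committed at this point in the thread, and the option name skip_xid is an assumption for illustration, not part of any posted patch.

```sql
-- Dedicated syntax, as proposed upthread (hypothetical):
ALTER SUBSCRIPTION test_sub SET SKIP TRANSACTION 590;

-- As a subscription option, analogous to disable_on_error (hypothetical):
ALTER SUBSCRIPTION test_sub SET (skip_xid = 590);

-- Resetting could then be done either by setting 0 or via a RESET variant:
ALTER SUBSCRIPTION test_sub SET (skip_xid = 0);
ALTER SUBSCRIPTION test_sub RESET (skip_xid);
```

The option form avoids new grammar productions, which is the consistency argument made above; the dedicated syntax makes the one-off, on-the-spot nature of the operation more explicit.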
On Wed, Jul 7, 2021 at 11:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jul 6, 2021 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
According to the doc, ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
specify a subset of parameters that can be specified by CREATE
SUBSCRIPTION. It makes sense to me for 'disable_on_error' since it can
be specified by CREATE SUBSCRIPTION. Whereas SKIP TRANSACTION stuff
cannot be done. Are you concerned about adding a syntax to ALTER
SUBSCRIPTION?Both for additional syntax and consistency with disable_on_error.
Isn't it just a current implementation that Alter only allows to
change parameters supported by Create? Is there a reason why we can't
allow Alter to set/change some parameters not supported by Create?

I think there is no reason for that, but looking at ALTER TABLE I
thought there is such a policy.
If we are looking for precedent then I think we allow to set
configuration parameters via Alter Database but not via Create
Database. Does that address your concern?
I thought the skipping transaction
feature is somewhat different from disable_on_error feature. The
former seems a feature to deal with a problem on the spot whereas the
latter seems a setting of a subscription. Anyway, if we use the
subscription option, we can reset the XID by setting 0? Or do we need
ALTER SUBSCRIPTION RESET?
The other commands like Alter Table, Alter Database, etc., which
provide a way to Set some parameter/option, have a Reset variant. I
think it would be good to have it for Alter Subscription as well but
we might want to allow other parameters to be reset by that as well.
--
With Regards,
Amit Kapila.
On Thu, Jul 8, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jul 7, 2021 at 11:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jul 6, 2021 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
According to the doc, ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
specify a subset of parameters that can be specified by CREATE
SUBSCRIPTION. It makes sense to me for 'disable_on_error' since it can
be specified by CREATE SUBSCRIPTION. Whereas SKIP TRANSACTION stuff
cannot be done. Are you concerned about adding a syntax to ALTER
SUBSCRIPTION?

Both for additional syntax and consistency with disable_on_error.
Isn't it just a current implementation that Alter only allows to
change parameters supported by Create? Is there a reason why we can't
allow Alter to set/change some parameters not supported by Create?

I think there is no reason for that, but looking at ALTER TABLE I
thought there is such a policy.

If we are looking for precedent then I think we allow to set
configuration parameters via Alter Database but not via Create
Database. Does that address your concern?
Thank you for the info! But it seems like CREATE DATABASE doesn't
support SET in the first place. Also interestingly, ALTER SUBSCRIPTION
supports both ENABLE/DISABLE and SET (enabled = on/off). I’m not sure
from the point of view of consistency with other CREATE, ALTER
commands, and disable_on_error but it might be better to avoid adding
additional syntax.
I thought the skipping transaction
feature is somewhat different from disable_on_error feature. The
former seems a feature to deal with a problem on the spot whereas the
latter seems a setting of a subscription. Anyway, if we use the
subscription option, we can reset the XID by setting 0? Or do we need
ALTER SUBSCRIPTION RESET?

The other commands like Alter Table, Alter Database, etc., which
provide a way to Set some parameter/option, have a Reset variant. I
think it would be good to have it for Alter Subscription as well but
we might want to allow other parameters to be reset by that as well.
Agreed.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jul 6, 2021 at 7:13 PM Alexey Lesovsky <lesovsky@gmail.com> wrote:
On Tue, Jul 6, 2021 at 10:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Also, I'd like to suggest thinking twice about the view name (and function used in view DDL) - "pg_stat_logical_replication_error" contains very common "logical replication" words, but the view contains errors related to subscriptions only. In the future there could be other kinds of errors related to logical replication, but not related to subscriptions - what will you do?
Is pg_stat_subscription_errors or
pg_stat_logical_replication_apply_errors better?

It seems to me 'pg_stat_subscription_conflicts' proposed by Amit Kapila is the most suitable, because it directly says about conflicts occurring on the subscription side. The name 'pg_stat_subscription_errors' is also good, especially in case of further extension if some kind of similar errors will be tracked.
I personally prefer pg_stat_subscription_errors since
pg_stat_subscription_conflicts could be used for conflict resolution
features in the future. This stats view I'm proposing is meant to
focus on errors that happened during applying logical changes. So
using the term 'errors' seems to make sense to me.
3. Add a counter field with total number of errors - it helps to calculate errors rates and aggregations (sum), and don't lose information about errors between view checks.
Do you mean to increment the error count if the error (command, xid,
and relid) is the same as the previous one? or to have the total
number of errors per subscription? And what can we infer from the
error rates and aggregations?

To be honest, I hurried up when I wrote the first email, and read only about stats view. Later, I read the starting email about the patch and rethought this note.
As I understand, when the conflict occurs, replication stops (until conflict is resolved), an error appears in the stats view. Now, no new errors can occur in the blocked subscription. Hence, there are impossible situations when many errors (like spikes) have occurred and a user didn't see that. If I am correct in my assumption, there is no need for counters. They are necessary only when errors might occur too frequently (like pg_stat_database.deadlocks). But if this is possible, I would prefer the total number of errors per subscription, as also proposed by Amit.
Yeah, the total number of errors seems better.
Under "error rates and aggregations" I also mean in the context of when a high number of errors occurred in a short period of time. If a user can read the "total errors" counter and keep this metric in his monitoring system, he will be able to calculate rates over time using functions in the monitoring system. This is extremely useful.
Thanks for your explanation. Agreed. But the rate depends on
wal_retrieve_retry_interval so is not likely to be high in practice.
I also would like to clarify, when conflict is resolved - the error record is cleared or kept in the view? If it is cleared, the error counter is required (because we don't want to lose all history of errors). If it is kept - the flag telling about the error is resolved is needed (or set xid to NULL). I mean when the user is watching the view, he should be able to identify if the error has already been resolved or not.
With the current patch, once the conflict is resolved by skipping the
transaction in question, its entry on the stats view is cleared. As
you suggested, if we have the total error counts in that view, it
would be good to keep the count and clear other fields such as xid,
last_failure, and command etc.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
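To make the clearing behavior under discussion concrete, here is a hypothetical sketch of what the view could show before and after the failing transaction is skipped. The view name and every column here are still proposals, not a settled design.

```sql
-- Hypothetical view name and columns; all under discussion in this thread.
SELECT subname, command, xid, error_count
FROM pg_stat_subscription_errors;

-- While the apply worker keeps failing on xid 590:
--  subname  | command |  xid | error_count
-- ----------+---------+------+-------------
--  test_sub | INSERT  |  590 |           5

-- After the transaction is skipped: identifying fields (command, xid,
-- last_failure, ...) are cleared, but the cumulative counter is kept:
--  subname  | command |  xid | error_count
-- ----------+---------+------+-------------
--  test_sub |         |      |           5
```

Keeping the counter while nulling the identifying fields lets a monitoring system both detect that the error is resolved and retain the error history.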
On Fri, Jul 9, 2021 at 5:43 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
On Tue, Jul 6, 2021 at 7:13 PM Alexey Lesovsky <lesovsky@gmail.com> wrote:
On Tue, Jul 6, 2021 at 10:58 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
Also, I'd like to suggest thinking twice about the view name (and
function used in view DDL) - "pg_stat_logical_replication_error" contains
very common "logical replication" words, but the view contains errors
related to subscriptions only. In the future there could be other kinds of
errors related to logical replication, but not related to subscriptions -
what will you do?

Is pg_stat_subscription_errors or
pg_stat_logical_replication_apply_errors better?

It seems to me 'pg_stat_subscription_conflicts' proposed by Amit Kapila
is the most suitable, because it directly says about conflicts occurring on
the subscription side. The name 'pg_stat_subscription_errors' is also good,
especially in case of further extension if some kind of similar errors will
be tracked.

I personally prefer pg_stat_subscription_errors since
pg_stat_subscription_conflicts could be used for conflict resolution
features in the future. This stats view I'm proposing is meant to
focus on errors that happened during applying logical changes. So
using the term 'errors' seems to make sense to me.
Agreed
3. Add a counter field with total number of errors - it helps to
calculate errors rates and aggregations (sum), and don't lose information
about errors between view checks.

Do you mean to increment the error count if the error (command, xid,
and relid) is the same as the previous one? or to have the total
number of errors per subscription? And what can we infer from the
error rates and aggregations?

To be honest, I hurried up when I wrote the first email, and read only
about stats view. Later, I read the starting email about the patch and
rethought this note.

As I understand, when the conflict occurs, replication stops (until
conflict is resolved), an error appears in the stats view. Now, no new
errors can occur in the blocked subscription. Hence, there are impossible
situations when many errors (like spikes) have occurred and a user didn't
see that. If I am correct in my assumption, there is no need for counters.
They are necessary only when errors might occur too frequently (like
pg_stat_database.deadlocks). But if this is possible, I would prefer the
total number of errors per subscription, as also proposed by Amit.

Yeah, the total number of errors seems better.
Agreed
Under "error rates and aggregations" I also mean in the context of when
a high number of errors occurred in a short period of time. If a user can
read the "total errors" counter and keep this metric in his monitoring
system, he will be able to calculate rates over time using functions in the
monitoring system. This is extremely useful.

Thanks for your explanation. Agreed. But the rate depends on
wal_retrieve_retry_interval so is not likely to be high in practice.
Agreed
I also would like to clarify, when conflict is resolved - the error
record is cleared or kept in the view? If it is cleared, the error counter
is required (because we don't want to lose all history of errors). If it is
kept - the flag telling about the error is resolved is needed (or set xid
to NULL). I mean when the user is watching the view, he should be able to
identify if the error has already been resolved or not.

With the current patch, once the conflict is resolved by skipping the
transaction in question, its entry on the stats view is cleared. As
you suggested, if we have the total error counts in that view, it
would be good to keep the count and clear other fields such as xid,
last_failure, and command etc.
Ok, looks nice. But I am curious how this will work in the case when there
are two (or more) errors in the same subscription, but different relations?
After resolution all these records are kept or they will be merged into a
single record (because subscription was the same for all errors)?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
--
Regards, Alexey Lesovsky
On Fri, Jul 9, 2021 at 5:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jul 8, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jul 7, 2021 at 11:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jul 6, 2021 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
According to the doc, ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
specify a subset of parameters that can be specified by CREATE
SUBSCRIPTION. It makes sense to me for 'disable_on_error' since it can
be specified by CREATE SUBSCRIPTION. Whereas SKIP TRANSACTION stuff
cannot be done. Are you concerned about adding a syntax to ALTER
SUBSCRIPTION?

Both for additional syntax and consistency with disable_on_error.
Isn't it just a current implementation that Alter only allows to
change parameters supported by Create? Is there a reason why we can't
allow Alter to set/change some parameters not supported by Create?

I think there is no reason for that, but looking at ALTER TABLE I
thought there is such a policy.

If we are looking for precedent then I think we allow to set
configuration parameters via Alter Database but not via Create
Database. Does that address your concern?

Thank you for the info! But it seems like CREATE DATABASE doesn't
support SET in the first place. Also interestingly, ALTER SUBSCRIPTION
supports both ENABLE/DISABLE and SET (enabled = on/off).
I think that is redundant but not sure if there is any reason behind doing so.
I’m not sure
from the point of view of consistency with other CREATE, ALTER
commands, and disable_on_error but it might be better to avoid adding
additional syntax.
If we can avoid introducing new syntax, that in itself is a good reason
to introduce it as an option.
--
With Regards,
Amit Kapila.
On Fri, Jul 9, 2021 at 9:02 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
On Fri, Jul 9, 2021 at 5:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I also would like to clarify, when conflict is resolved - the error record is cleared or kept in the view? If it is cleared, the error counter is required (because we don't want to lose all history of errors). If it is kept - the flag telling about the error is resolved is needed (or set xid to NULL). I mean when the user is watching the view, he should be able to identify if the error has already been resolved or not.
With the current patch, once the conflict is resolved by skipping the
transaction in question, its entry on the stats view is cleared. As
you suggested, if we have the total error counts in that view, it
would be good to keep the count and clear other fields such as xid,
last_failure, and command etc.

Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
We can't proceed unless the first error is resolved, so there
shouldn't be multiple unresolved errors. However, there is an
exception to it which is during initial table sync and I think the
view should have separate rows for each table sync.
--
With Regards,
Amit Kapila.
On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Ok, looks nice. But I am curious how this will work in the case when
there are two (or more) errors in the same subscription, but different
relations?

We can't proceed unless the first error is resolved, so there
shouldn't be multiple unresolved errors.
Ok. I thought multiple errors are possible when many tables are initialized
using parallel workers (with max_sync_workers_per_subscription > 1).
--
Regards, Alexey
On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
We can't proceed unless the first error is resolved, so there
shouldn't be multiple unresolved errors.

Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
Yeah, that is possible but that is covered by the second condition
mentioned by me, and in such cases I think we should have separate rows
for each tablesync. Is that right, Sawada-san or do you have something
else in mind?
--
With Regards,
Amit Kapila.
On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
We can't proceed unless the first error is resolved, so there
shouldn't be multiple unresolved errors.

Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
Yeah, that is possible but that covers under the second condition
mentioned by me and in such cases I think we should have separate rows
for each tablesync. Is that right, Sawada-san or do you have something
else in mind?
Yeah, I agree to have separate rows for each table sync. The table
should not be processed by both the table sync worker and the apply
worker at a time so the pair of subscription OID and relation OID will
be unique. I think that we can have a boolean column in the view,
indicating whether the error entry is reported by the table sync
worker or the apply worker, or maybe we also can have the action
column show "TABLE SYNC" if the error is reported by the table sync
worker.
When it comes to removing the subscription errors in
pgstat_vacuum_stat(), I think we need to seq scan on the hash table
and send the messages to purge the subscription error entries.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Jul 12, 2021 at 11:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
We can't proceed unless the first error is resolved, so there
shouldn't be multiple unresolved errors.

Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
Yeah, that is possible but that covers under the second condition
mentioned by me and in such cases I think we should have separate rows
for each tablesync. Is that right, Sawada-san or do you have something
else in mind?

Yeah, I agree to have separate rows for each table sync. The table
should not be processed by both the table sync worker and the apply
worker at a time so the pair of subscription OID and relation OID will
be unique. I think that we have a boolean column in the view,
indicating whether the error entry is reported by the table sync
worker or the apply worker, or maybe we also can have the action
column show "TABLE SYNC" if the error is reported by the table sync
worker.
Or similar to backend_type (text) in pg_stat_activity, we can have
something like error_source (text) which will display apply worker or
tablesync worker? I think if we have this column then even if there is
a chance that both apply and sync worker operates on the same
relation, we can identify it via this column.
--
With Regards,
Amit Kapila.
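As a hypothetical sketch of the error_source idea, such a text column would let users filter by the reporting worker, much like backend_type in pg_stat_activity. The view name, the column name, and the label values are assumptions here, not part of a posted patch.

```sql
-- Hypothetical: one row per table sync error plus one per apply error,
-- distinguished by an error_source text column
SELECT subname, relid::regclass AS relation, error_source, xid
FROM pg_stat_subscription_errors
WHERE error_source = 'tablesync worker';
```

A text column also covers the corner case raised above: even if both an apply worker and a sync worker ever reported an error for the same relation, the two rows would remain distinguishable.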
On Mon, Jul 12, 2021 at 8:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 11:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
We can't proceed unless the first error is resolved, so there
shouldn't be multiple unresolved errors.

Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
Yeah, that is possible but that covers under the second condition
mentioned by me and in such cases I think we should have separate rows
for each tablesync. Is that right, Sawada-san or do you have something
else in mind?

Yeah, I agree to have separate rows for each table sync. The table
should not be processed by both the table sync worker and the apply
worker at a time so the pair of subscription OID and relation OID will
be unique. I think that we have a boolean column in the view,
indicating whether the error entry is reported by the table sync
worker or the apply worker, or maybe we also can have the action
column show "TABLE SYNC" if the error is reported by the table sync
worker.

Or similar to backend_type (text) in pg_stat_activity, we can have
something like error_source (text) which will display apply worker or
tablesync worker? I think if we have this column then even if there is
a chance that both apply and sync worker operates on the same
relation, we can identify it via this column.
Sounds good. I'll incorporate this in the next version patch that I'm
planning to submit this week.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Jul 14, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 12, 2021 at 8:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 11:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
We can't proceed unless the first error is resolved, so there
shouldn't be multiple unresolved errors.

Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
Yeah, that is possible but that covers under the second condition
mentioned by me and in such cases I think we should have separate rows
for each tablesync. Is that right, Sawada-san or do you have something
else in mind?

Yeah, I agree to have separate rows for each table sync. The table
should not be processed by both the table sync worker and the apply
worker at a time so the pair of subscription OID and relation OID will
be unique. I think that we have a boolean column in the view,
indicating whether the error entry is reported by the table sync
worker or the apply worker, or maybe we also can have the action
column show "TABLE SYNC" if the error is reported by the table sync
worker.

Or similar to backend_type (text) in pg_stat_activity, we can have
something like error_source (text) which will display apply worker or
tablesync worker? I think if we have this column then even if there is
a chance that both apply and sync worker operates on the same
relation, we can identify it via this column.

Sounds good. I'll incorporate this in the next version patch that I'm
planning to submit this week.
Sorry, I could not make it this week. I'll submit them early next week.
While updating the patch I thought we need to have more design
discussion on two points of clearing error details after the error is
resolved:
1. How to clear apply worker errors. IIUC we've discussed that once
the apply worker skipped the transaction we leave the error entry
itself but clear its fields except for some fields such as failure
counts. But given that the stats messages could be lost, how can we
ensure to clear those error details? For table sync workers’ error, we
can have autovacuum workers periodically check entires of
pg_subscription_rel and clear the error entry if the table sync worker
completes table sync (i.g., checking if srsubstate = ‘r’). But there
is no such information for the apply workers and subscriptions. In
addition to sending the message clearing the error details just after
skipping the transaction, I thought that we can have apply workers
periodically send the message clearing the error details but it seems
not good.
2. Do we really want to leave the table sync worker's error entry even after the
error is resolved and the table sync completes? Unlike the apply
worker errors, the number of table sync worker errors could be very
large, for example, if a subscriber subscribes to many tables. If we
leave those errors in the stats view, they use more memory and
could affect the performance of writing and reading the stats file. If such
leftover table sync error entries are not helpful in practice, I think we can
remove them rather than clear some fields. What do you think?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Jul 16, 2021 at 8:33 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jul 14, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Sounds good. I'll incorporate this in the next version patch that I'm
planning to submit this week.

Sorry, I could not make it this week. I'll submit them early next week.
No problem.
While updating the patch I thought we need more design
discussion on two points about clearing error details after the error is
resolved:

1. How to clear apply worker errors. IIUC we've discussed that once
the apply worker has skipped the transaction we leave the error entry
itself but clear its fields except for some fields such as failure
counts. But given that the stats messages could be lost, how can we
ensure that those error details are cleared? For table sync workers'
errors, we can have autovacuum workers periodically check entries of
pg_subscription_rel and clear the error entry if the table sync worker
has completed table sync (i.e., checking whether srsubstate = 'r'). But there
is no such information for the apply workers and subscriptions.
But won't the corresponding subscription (pg_subscription) have the
XID as InvalidTransactionId once the xid is skipped, or at least a
different XID than we would have in the pg_stat view? Can we use that to
reset the entry via vacuum?
In addition to sending the message clearing the error details just after
skipping the transaction, I thought we could have apply workers
periodically send a message clearing the error details, but that doesn't
seem good.
Yeah, such things should be a last resort.
2. Do we really want to leave the table sync worker's error entry even after the
error is resolved and the table sync completes? Unlike the apply
worker errors, the number of table sync worker errors could be very
large, for example, if a subscriber subscribes to many tables. If we
leave those errors in the stats view, they use more memory and
could affect the performance of writing and reading the stats file. If such
leftover table sync error entries are not helpful in practice, I think we can
remove them rather than clear some fields. What do you think?
Sounds reasonable to me. One might think to update the subscription
error count to include table sync errors, but I'm not sure that is
helpful; even if it is, we can extend it later.
--
With Regards,
Amit Kapila.
On Mon, Jul 19, 2021 at 2:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jul 16, 2021 at 8:33 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jul 14, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Sounds good. I'll incorporate this in the next version patch that I'm
planning to submit this week.

Sorry, I could not make it this week. I'll submit them early next week.
No problem.
While updating the patch I thought we need more design
discussion on two points about clearing error details after the error is
resolved:

1. How to clear apply worker errors. IIUC we've discussed that once
the apply worker has skipped the transaction we leave the error entry
itself but clear its fields except for some fields such as failure
counts. But given that the stats messages could be lost, how can we
ensure that those error details are cleared? For table sync workers'
errors, we can have autovacuum workers periodically check entries of
pg_subscription_rel and clear the error entry if the table sync worker
has completed table sync (i.e., checking whether srsubstate = 'r'). But there
is no such information for the apply workers and subscriptions.

But won't the corresponding subscription (pg_subscription) have the
XID as InvalidTransactionId once the xid is skipped, or at least a
different XID than we would have in the pg_stat view? Can we use that to
reset the entry via vacuum?
I think the XID is InvalidTransactionId until the user specifies it. So
I think we cannot tell whether we're before or after skipping
just from the transaction ID. No?
In addition to sending the message clearing the error details just after
skipping the transaction, I thought we could have apply workers
periodically send a message clearing the error details, but that doesn't
seem good.

Yeah, such things should be a last resort.
2. Do we really want to leave the table sync worker's error entry even after the
error is resolved and the table sync completes? Unlike the apply
worker errors, the number of table sync worker errors could be very
large, for example, if a subscriber subscribes to many tables. If we
leave those errors in the stats view, they use more memory and
could affect the performance of writing and reading the stats file. If such
leftover table sync error entries are not helpful in practice, I think we can
remove them rather than clear some fields. What do you think?

Sounds reasonable to me. One might think to update the subscription
error count to include table sync errors, but I'm not sure that is
helpful; even if it is, we can extend it later.
Agreed.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Sat, Jul 17, 2021 at 12:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jul 14, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 12, 2021 at 8:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 11:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
We can't proceed unless the first error is resolved, so there
shouldn't be multiple unresolved errors.

Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).

Yeah, that is possible, but it is covered under the second condition
I mentioned, and in such cases I think we should have separate rows
for each tablesync. Is that right, Sawada-san, or do you have something
else in mind?

Yeah, I agree to have separate rows for each table sync. The table
should not be processed by both the table sync worker and the apply
worker at a time, so the pair of subscription OID and relation OID will
be unique. I think that we can have a boolean column in the view,
indicating whether the error entry is reported by the table sync
worker or the apply worker, or maybe we can also have the action
column show "TABLE SYNC" if the error is reported by the table sync
worker.

Or similar to backend_type (text) in pg_stat_activity, we can have
something like error_source (text) which will display apply worker or
tablesync worker? I think if we have this column then even if there is
a chance that both the apply and sync workers operate on the same
relation, we can identify it via this column.

Sounds good. I'll incorporate this in the next version patch that I'm
planning to submit this week.

Sorry, I could not make it this week. I'll submit them early next week.
While updating the patch I thought we need more design
discussion on two points about clearing error details after the error is
resolved:

1. How to clear apply worker errors. IIUC we've discussed that once
the apply worker has skipped the transaction we leave the error entry
itself but clear its fields except for some fields such as failure
counts. But given that the stats messages could be lost, how can we
ensure that those error details are cleared? For table sync workers'
errors, we can have autovacuum workers periodically check entries of
pg_subscription_rel and clear the error entry if the table sync worker
has completed table sync (i.e., checking whether srsubstate = 'r'). But there
is no such information for the apply workers and subscriptions. In
addition to sending the message clearing the error details just after
skipping the transaction, I thought we could have apply workers
periodically send a message clearing the error details, but that doesn't
seem good.
I think that the motivation behind the idea of leaving error entries
and clearing some of their fields is that users can check whether the error
is successfully resolved and the worker is working fine. But we can
also check that in another way, for example, by checking the
pg_stat_subscription view. So is it worth considering leaving the
apply worker errors as they are?
2. Do we really want to leave the table sync worker's error entry even after the
error is resolved and the table sync completes? Unlike the apply
worker errors, the number of table sync worker errors could be very
large, for example, if a subscriber subscribes to many tables. If we
leave those errors in the stats view, they use more memory and
could affect the performance of writing and reading the stats file. If such
leftover table sync error entries are not helpful in practice, I think we can
remove them rather than clear some fields. What do you think?
I've attached the updated version patch that incorporates all comments
I got so far, except for the clearing-error-details part I mentioned
above. After getting a consensus on those points, I'll incorporate the
idea into the patches.
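For reference, here is a usage sketch of the proposed interface (the skip_xid option, RESET syntax, and the pg_stat_subscription_errors view all come from the attached v2 patch and may change; XID 740 is just the value from the earlier example):

```sql
-- Identify the failing remote transaction on the subscriber:
SELECT subname, relid, command, xid, last_failure_message
FROM pg_stat_subscription_errors;

-- Skip applying all changes of that remote transaction:
ALTER SUBSCRIPTION test_sub SET (skip_xid = 740);

-- Or clear a skip_xid that was set by mistake:
ALTER SUBSCRIPTION test_sub RESET (skip_xid);
```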
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v2-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patchapplication/x-patch; name=v2-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patchDownload
From 4a2abc82db9ab37699f09df9be86f150c58db3cf Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:18:58 +0900
Subject: [PATCH v2 3/3] Add skip_xid option to ALTER SUBSCRIPTION.
---
doc/src/sgml/logical-replication.sgml | 33 ++-
doc/src/sgml/ref/alter_subscription.sgml | 47 +++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 138 +++++++++--
src/backend/parser/gram.y | 11 +-
src/backend/replication/logical/worker.c | 252 +++++++++++++++++----
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 4 +-
src/test/regress/expected/subscription.out | 24 ++
src/test/regress/sql/subscription.sql | 20 ++
src/test/subscription/t/023_skip_xact.pl | 185 +++++++++++++++
11 files changed, 645 insertions(+), 83 deletions(-)
create mode 100644 src/test/subscription/t/023_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..d222e64122 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,14 +333,41 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <link linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in the <structname>pg_stat_subscription_errors</structname>
+ view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+ datname | subname | relid | command | xid | failure_source | failure_count | last_failure | last_failure_message
+----------+----------+-------+---------+-----+----------------+---------------+-------------------------------+------------------------------------------------------------
+ postgres | test_sub | 16385 | INSERT | 740 | apply | 1 | 2021-07-15 21:54:58.804595+00 | duplicate key value violates unique constraint "test_pkey"
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: during apply of "INSERT" for relation "public.test" in transaction with xid 740 committs 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID to skip (740 in the examples above) can be found in those outputs.
+ The transaction can be skipped by setting <replaceable>skip_xid</replaceable> to
+ the subscription by <command>ALTER SUBSCRIPTION ... SET</command>.
+ Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..e961f83eca 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,15 +193,47 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
- This clause alters parameters originally set by
- <xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
- are <literal>slot_name</literal>,
- <literal>synchronous_commit</literal>,
- <literal>binary</literal>, and
- <literal>streaming</literal>.
+ This clause sets or resets a subscription option. The parameters that can be
+ set are the parameters originally set by <xref linkend="sql-createsubscription"/>:
+ <literal>slot_name</literal>, <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>, and following
+ parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ will stop until the problem is resolved (see
+ <xref linkend="logical-replication-conflicts"/> for details).
+ The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with the incoming change or by
+ skipping the whole transaction. This option specifies the transaction
+ ID whose changes the logical replication worker skips applying. The
+ logical replication worker skips all data modification changes of the
+ specified transaction. Therefore, since it skips the whole
+ transaction including changes that don't violate any constraint,
+ it should only be used as a last resort. This option has no effect
+ on a transaction that is already prepared by enabling
+ <literal>two_phase</literal> on the subscriber. After the logical
+ replication worker successfully skips the transaction, the transaction ID
+ (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>,
+ and <literal>skip_xid</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 239d263f83..b0a4b1de60 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -99,7 +101,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -128,12 +131,23 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset)
+ {
+ if (defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
+ }
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -141,7 +155,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_CONNECT;
- opts->connect = defGetBoolean(defel);
+ if (!is_reset)
+ opts->connect = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_ENABLED) &&
strcmp(defel->defname, "enabled") == 0)
@@ -150,7 +165,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_ENABLED;
- opts->enabled = defGetBoolean(defel);
+ if (!is_reset)
+ opts->enabled = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_CREATE_SLOT) &&
strcmp(defel->defname, "create_slot") == 0)
@@ -159,7 +175,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_CREATE_SLOT;
- opts->create_slot = defGetBoolean(defel);
+ if (!is_reset)
+ opts->create_slot = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_SLOT_NAME) &&
strcmp(defel->defname, "slot_name") == 0)
@@ -168,7 +185,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SLOT_NAME;
- opts->slot_name = defGetString(defel);
+ if (!is_reset)
+ opts->slot_name = defGetString(defel);
/* Setting slot_name = NONE is treated as no slot name. */
if (strcmp(opts->slot_name, "none") == 0)
@@ -181,7 +199,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_COPY_DATA;
- opts->copy_data = defGetBoolean(defel);
+ if (!is_reset)
+ opts->copy_data = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_SYNCHRONOUS_COMMIT) &&
strcmp(defel->defname, "synchronous_commit") == 0)
@@ -190,12 +209,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /* Test if the given value is valid for synchronous_commit GUC. */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -204,7 +226,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_REFRESH;
- opts->refresh = defGetBoolean(defel);
+ if (!is_reset)
+ opts->refresh = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_BINARY) &&
strcmp(defel->defname, "binary") == 0)
@@ -213,7 +236,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -222,7 +246,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -243,7 +268,26 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
- opts->twophase = defGetBoolean(defel);
+ if (!is_reset)
+ opts->twophase = defGetBoolean(defel);
+ }
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (!is_reset)
+ {
+ int64 arg;
+ TransactionId xid;
+
+ arg = defGetInt64(defel);
+ xid = (TransactionId) arg;
+ if (arg < 0 || !TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
}
else
ereport(ERROR,
@@ -414,7 +458,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -487,6 +532,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -883,14 +929,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -935,14 +981,60 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_STREAMING |
+ SUBOPT_BINARY | SUBOPT_SKIP_XID);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -977,7 +1069,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1027,7 +1119,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1075,7 +1167,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 10da5c5c51..41a1d333f6 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9699,7 +9699,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 7c2ec983bb..e09929206f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -277,6 +278,16 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * True if we're skipping changes of the specified transaction in
+ * MySubscription->skip_xid. Please note that we don't skip receiving the changes
+ * since we decide whether or not to skip applying the changes when starting to
+ * apply. When stopping the skipping behavior, we reset the skip XID (subskipxid)
+ * in the pg_subscription and associate origin status to the transaction that resets
+ * the skip XID so that we can start streaming from the next transaction.
+ */
+static bool skipping_changes = false;
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -332,8 +343,7 @@ static void maybe_reread_subscription(void);
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
-static void apply_handle_commit_internal(StringInfo s,
- LogicalRepCommitData *commit_data);
+static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
TupleTableSlot *remoteslot);
@@ -361,6 +371,9 @@ static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static void reset_apply_error_context_rel(void);
static void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -858,6 +871,9 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -882,7 +898,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(s, &commit_data);
+ /*
+ * Stop the skipping transaction if enabled. Otherwise, commit the
+ * changes that are just applied.
+ */
+ if (skipping_changes)
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -910,6 +937,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -936,47 +966,55 @@ apply_handle_prepare(StringInfo s)
LSN_FORMAT_ARGS(remote_final_lsn))));
/*
- * Compute unique GID for two_phase transactions. We don't use GID of
- * prepared transaction sent by server as that can lead to deadlock when
- * we have multiple subscriptions from same node point to publications on
- * the same node. See comments atop worker.c
+ * Prepare transaction if we haven't skipped the changes of this transaction.
*/
- TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
- gid, sizeof(gid));
+ if (skipping_changes)
+ stop_skipping_changes(prepare_data.end_lsn, prepare_data.prepare_time);
+ else
+ {
+ /*
+ * Compute unique GID for two_phase transactions. We don't use GID of
+ * prepared transaction sent by server as that can lead to deadlock when
+ * we have multiple subscriptions from same node point to publications on
+ * the same node. See comments atop worker.c
+ */
+ TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
+ gid, sizeof(gid));
- /*
- * Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
- *
- * XXX, We can optimize such that at commit prepared time, we first check
- * whether we have prepared the transaction or not but that doesn't seem
- * worthwhile because such cases shouldn't be common.
- */
- begin_replication_step();
+ /*
+ * Unlike commit, here, we always prepare the transaction even though no
+ * change has happened in this transaction. It is done this way because at
+ * commit prepared time, we won't know whether we have skipped preparing a
+ * transaction because of no change.
+ *
+ * XXX, We can optimize such that at commit prepared time, we first check
+ * whether we have prepared the transaction or not but that doesn't seem
+ * worthwhile because such cases shouldn't be common.
+ */
+ begin_replication_step();
- /*
- * BeginTransactionBlock is necessary to balance the EndTransactionBlock
- * called within the PrepareTransactionBlock below.
- */
- BeginTransactionBlock();
- CommitTransactionCommand(); /* Completes the preceding Begin command. */
+ /*
+ * BeginTransactionBlock is necessary to balance the EndTransactionBlock
+ * called within the PrepareTransactionBlock below.
+ */
+ BeginTransactionBlock();
+ CommitTransactionCommand(); /* Completes the preceding Begin command. */
- /*
- * Update origin state so we can restart streaming from correct position
- * in case of crash.
- */
- replorigin_session_origin_lsn = prepare_data.end_lsn;
- replorigin_session_origin_timestamp = prepare_data.prepare_time;
+ /*
+ * Update origin state so we can restart streaming from correct position
+ * in case of crash.
+ */
+ replorigin_session_origin_lsn = prepare_data.end_lsn;
+ replorigin_session_origin_timestamp = prepare_data.prepare_time;
- PrepareTransactionBlock(gid);
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ PrepareTransactionBlock(gid);
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
- store_flush_position(prepare_data.end_lsn);
+ }
+ store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
/* Process any tables that are being synchronized in parallel. */
@@ -1089,9 +1127,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes, unless we are
+ * skipping the changes of this transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !skipping_changes &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1113,6 +1152,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1125,9 +1167,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1209,6 +1248,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1301,6 +1341,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping changes if enabled */
+ if (skipping_changes)
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1311,11 +1355,11 @@ static void
apply_handle_stream_commit(StringInfo s)
{
TransactionId xid;
+ LogicalRepCommitData commit_data;
StringInfoData s2;
int nchanges;
char path[MAXPGPATH];
char *buffer = NULL;
- LogicalRepCommitData commit_data;
StreamXidHash *ent;
MemoryContext oldcxt;
BufFile *fd;
@@ -1329,8 +1373,13 @@ apply_handle_stream_commit(StringInfo s)
apply_error_callback_arg.remote_xid = xid;
apply_error_callback_arg.committs = commit_data.committime;
+ remote_final_lsn = commit_data.commit_lsn;
+
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/* Make sure we have an open transaction */
begin_replication_step();
@@ -1362,13 +1411,12 @@ apply_handle_stream_commit(StringInfo s)
MemoryContextSwitchTo(oldcxt);
- remote_final_lsn = commit_data.commit_lsn;
-
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
*/
in_remote_transaction = true;
+
pgstat_report_activity(STATE_RUNNING, NULL);
end_replication_step();
@@ -1441,7 +1489,17 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
nchanges, path);
- apply_handle_commit_internal(s, &commit_data);
+ if (skipping_changes)
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -1450,7 +1508,6 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
-
reset_apply_error_context_info();
}
@@ -1458,7 +1515,7 @@ apply_handle_stream_commit(StringInfo s)
* Helper function for apply_handle_commit and apply_handle_stream_commit.
*/
static void
-apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
+apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
if (IsTransactionState())
{
@@ -2330,6 +2387,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
ErrorContextCallback errcallback;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (skipping_changes &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
+ * Push the apply error context callback. Other fields will be filled
+ * in while applying the change.
@@ -3789,3 +3857,91 @@ reset_logicalrep_error_context_rel(void)
apply_error_callback_arg.relname = NULL;
}
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by the skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!skipping_changes);
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_changes = true;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID.
+ *
+ * If origin_lsn and origin_committs are valid, we update the origin state
+ * within the transaction that resets the skip XID so that we can restart
+ * streaming from the transaction following the one we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(skipping_changes);
+ Assert(TransactionIdIsValid(MySubscription->skipxid));
+ Assert(in_remote_transaction);
+
+ /* Stop skipping changes */
+ skipping_changes = false;
+ ereport(LOG,
+ errmsg("done skipping logical replication transaction with xid %u",
+ MySubscription->skipxid));
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* Update the system catalog to reset the skip XID */
+ if (!IsTransactionState())
+ StartTransactionCommand();
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..e5a95a02ec 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index def9651b34..2c6d321284 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3658,7 +3658,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3675,6 +3676,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ad6b4e4bd3..11c9da4162 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -283,6 +283,30 @@ ERROR: unrecognized subscription parameter: "two_phase"
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ERROR: cannot set streaming = true for two-phase enabled subscription
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 4294967295);
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid, synchronous_commit, binary, streaming);
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 4294967296);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = -1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
+-- fail - parameters not supported by RESET
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index b732871407..1db0a6d22f 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -217,6 +217,26 @@ ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 4294967295);
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid, synchronous_commit, binary, streaming);
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 4294967296);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = -1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
+-- fail - parameters not supported by RESET
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
DROP SUBSCRIPTION regress_testsub;
diff --git a/src/test/subscription/t/023_skip_xact.pl b/src/test/subscription/t/023_skip_xact.pl
new file mode 100644
index 0000000000..7b29828cce
--- /dev/null
+++ b/src/test/subscription/t/023_skip_xact.pl
@@ -0,0 +1,185 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Test skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 8;
+
+sub test_subscription_error
+{
+ my ($node, $expected, $source, $relname, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# Don't flood the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ 'wal_retrieve_retry_interval = 5s');
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber we
+# create the same tables but with primary keys, and insert some data that will
+# conflict with the data replicated from the publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Start logical replication. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to a unique constraint violation.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on);");
+
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Also wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data into test_tab1, raising an error on the subscriber due to a
+# unique constraint violation on test_tab1.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab1 VALUES (1)");
+
+# Insert enough rows to test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber for the same reason.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);");
+
+# Check that the errors on both subscriptions are reported.
+test_subscription_error($node_subscriber, qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'apply', 'test_tab1', 'error reporting by the apply worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'tablesync', 'test_tab2', 'error reporting by the table sync worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'apply', 'test_tab_streaming', 'error reporting by the apply worker');
+
+# Set the XIDs of the failed transactions on the subscriptions so they are skipped.
+my $skip_xid1 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab1'::regclass");
+my $skip_xid2 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab_streaming'::regclass");
+
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (skip_xid = $skip_xid1)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_streaming SET (skip_xid = $skip_xid2)");
+
+# Restart the subscriber to resume logical replication without waiting for
+# wal_retrieve_retry_interval.
+$node_subscriber->restart;
+
+# Wait until the transactions in question are skipped.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription
+WHERE subname in ('tap_sub', 'tap_sub_streaming') AND subskipxid IS NULL
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+# Insert data to test_tab1 that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+
+# Also, insert data to test_tab_streaming that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transaction.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped transaction");
+
+# Check that the view shows no entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
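Putting the pieces together, the end-to-end workflow the patches enable can be sketched as follows. This is a rough illustration based on the drafts above; the `pg_stat_subscription_errors` view, its `xid` column, and the `skip_xid` syntax are still under discussion and may change:

```sql
-- On the subscriber, identify the failing remote transaction, either from the
-- errcontext in the server log:
--   CONTEXT: during apply of "INSERT" for relation "public.test" in
--   transaction with xid 590 commit timestamp 2021-05-21 14:32:02.134273+09
-- or from the proposed statistics view:
SELECT subname, relid::regclass, xid
FROM pg_stat_subscription_errors;

-- Tell the subscription to skip that remote transaction:
ALTER SUBSCRIPTION test_sub SET (skip_xid = 590);

-- Once the transaction has been skipped, the worker resets subskipxid to NULL:
SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'test_sub';

-- The setting can also be cleared manually before it takes effect:
ALTER SUBSCRIPTION test_sub RESET (skip_xid);
```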
Attachment: v2-0001-Add-errcontext-to-errors-of-the-applying-logical-.patch (application/x-patch)
From 8578720819ea56aed4993bc926402b67179868de Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v2 1/3] Add errcontext to errors of the applying logical
replication changes.
---
src/backend/commands/tablecmds.c | 7 +
src/backend/replication/logical/proto.c | 49 +++++
src/backend/replication/logical/worker.c | 220 ++++++++++++++++++++---
src/include/replication/logicalproto.h | 1 +
src/include/replication/logicalworker.h | 2 +
5 files changed, 257 insertions(+), 22 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 46b108caa6..4662ec4787 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -78,6 +78,7 @@
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "pgstat.h"
+#include "replication/logicalworker.h"
#include "rewrite/rewriteDefine.h"
#include "rewrite/rewriteHandler.h"
#include "rewrite/rewriteManip.h"
@@ -1897,6 +1898,9 @@ ExecuteTruncateGuts(List *explicit_rels,
continue;
}
+ /* Set logical replication error callback info if necessary */
+ set_logicalrep_error_context_rel(rel);
+
/*
* Build the lists of foreign tables belonging to each foreign server
* and pass each list to the foreign data wrapper's callback function,
@@ -2004,6 +2008,9 @@ ExecuteTruncateGuts(List *explicit_rels,
pgstat_count_truncate(rel);
}
+ /* Reset logical replication error callback info */
+ reset_logicalrep_error_context_rel();
+
/* Now go through the hash table, and truncate foreign tables */
if (ft_htab)
{
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 13c8c3bd5b..833a97aec9 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1109,3 +1109,52 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Return the string representation of a LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index b9a7a7ffbb..c23713468c 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -221,6 +221,27 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname; /* used for error context */
+ char *relname; /* used for error context */
+
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .relname = NULL,
+ .nspname = NULL,
+ .remote_xid = InvalidTransactionId,
+ .committs = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -333,6 +354,10 @@ static void apply_handle_tuple_routing(ApplyExecutionData *edata,
/* Compute GID for two_phase transactions */
static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int szgid);
+static void apply_error_callback(void *arg);
+static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
+static void reset_apply_error_context_rel(void);
+static void reset_apply_error_context_info(void);
/*
* Should this worker apply changes for given relation.
@@ -826,6 +851,8 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.committime;
remote_final_lsn = begin_data.final_lsn;
@@ -859,6 +886,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -876,6 +904,7 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
remote_final_lsn = begin_data.prepare_lsn;
@@ -894,6 +923,8 @@ apply_handle_prepare(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_prepare(s, &prepare_data);
+ apply_error_callback_arg.remote_xid = prepare_data.xid;
+ apply_error_callback_arg.committs = prepare_data.prepare_time;
if (prepare_data.prepare_lsn != remote_final_lsn)
ereport(ERROR,
@@ -950,6 +981,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -962,6 +994,8 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ apply_error_callback_arg.remote_xid = prepare_data.xid;
+ apply_error_callback_arg.committs = prepare_data.commit_time;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -989,6 +1023,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1001,6 +1036,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ apply_error_callback_arg.remote_xid = rollback_data.xid;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1038,6 +1074,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1094,6 +1131,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ apply_error_callback_arg.remote_xid = stream_xid;
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1150,6 +1189,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1173,7 +1213,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ apply_error_callback_arg.remote_xid = xid;
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1198,6 +1241,7 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ apply_error_callback_arg.remote_xid = subxid;
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1222,6 +1266,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1253,6 +1298,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1277,6 +1324,8 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.committs = commit_data.committime;
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1399,6 +1448,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1518,6 +1569,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1541,6 +1595,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1639,6 +1696,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1692,6 +1752,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1795,6 +1858,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1820,6 +1886,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2224,6 +2293,9 @@ apply_handle_truncate(StringInfo s)
* Even if we used CASCADE on the upstream primary we explicitly default
* to replaying changes without further cascading. This might be later
* changeable with a user specified option.
+ *
+ * Both namespace and relation name for error callback will be set in
+ * ExecuteTruncateGuts().
*/
ExecuteTruncateGuts(rels,
relids,
@@ -2254,44 +2326,54 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
+
+ /*
+ * Push the apply error context callback. The remaining fields will be
+ * filled in while applying the change.
+ */
+ apply_error_callback_arg.command = action;
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2300,45 +2382,48 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_END:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
}
/*
@@ -3571,3 +3656,94 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("during apply of \"%s\""),
+ logicalrep_message_type(apply_error_callback_arg.command));
+
+ if (apply_error_callback_arg.relname)
+ appendStringInfo(&buf, _(" for relation \"%s.%s\""),
+ apply_error_callback_arg.nspname,
+ apply_error_callback_arg.relname);
+
+ if (TransactionIdIsNormal(apply_error_callback_arg.remote_xid))
+ appendStringInfo(&buf, _(" in transaction with xid %u, commit timestamp %s"),
+ apply_error_callback_arg.remote_xid,
+ apply_error_callback_arg.committs == 0
+ ? "(unset)"
+ : timestamptz_to_str(apply_error_callback_arg.committs));
+
+ errcontext("%s", buf.data);
+}
+
+/* Set relation information for apply error callback */
+static void
+set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
+{
+ apply_error_callback_arg.nspname = rel->remoterel.nspname;
+ apply_error_callback_arg.relname = rel->remoterel.relname;
+}
+
+/* Reset relation information for apply error callback */
+static void
+reset_apply_error_context_rel(void)
+{
+ apply_error_callback_arg.nspname = NULL;
+ apply_error_callback_arg.relname = NULL;
+}
+
+/* Reset all information for apply error callback */
+static void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.remote_xid = InvalidTransactionId;
+ apply_error_callback_arg.committs = 0;
+ reset_apply_error_context_rel();
+}
+
+/*
+ * Set relation information for the error callback from the given relation.
+ * Both set_logicalrep_error_context_rel() and
+ * reset_logicalrep_error_context_rel() are intended to be used by
+ * functions outside of the logical replication module that don't use
+ * LogicalRepRelMapEntry.
+ *
+ * The caller must call reset_logicalrep_error_context_rel() after use
+ * to free the memory used for the names.
+ */
+void
+set_logicalrep_error_context_rel(Relation rel)
+{
+ if (IsLogicalWorker())
+ {
+ apply_error_callback_arg.nspname =
+ get_namespace_name(RelationGetNamespace(rel));
+ apply_error_callback_arg.relname =
+ pstrdup(RelationGetRelationName(rel));
+ }
+}
+
+/* Reset relation information for error callback set */
+void
+reset_logicalrep_error_context_rel(void)
+{
+ if (IsLogicalWorker())
+ {
+ if (apply_error_callback_arg.nspname)
+ pfree(apply_error_callback_arg.nspname);
+ apply_error_callback_arg.nspname = NULL;
+
+ if (apply_error_callback_arg.relname)
+ pfree(apply_error_callback_arg.relname);
+ apply_error_callback_arg.relname = NULL;
+ }
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 63de90d94a..c78a4409bc 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -242,5 +242,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/include/replication/logicalworker.h b/src/include/replication/logicalworker.h
index 2ad61a001a..d3e8514ffd 100644
--- a/src/include/replication/logicalworker.h
+++ b/src/include/replication/logicalworker.h
@@ -15,5 +15,7 @@
extern void ApplyWorkerMain(Datum main_arg);
extern bool IsLogicalWorker(void);
+extern void set_logicalrep_error_context_rel(Relation rel);
+extern void reset_logicalrep_error_context_rel(void);
#endif /* LOGICALWORKER_H */
--
2.24.3 (Apple Git-128)
Attachment: v2-0002-Add-pg_stat_logical_replication_error-statistics-.patch (application/x-patch)
From e513f9b2e1e8c5b05c1aafe08a2794445e280501 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v2 2/3] Add pg_stat_logical_replication_error statistics view.
---
doc/src/sgml/monitoring.sgml | 126 +++++++
src/backend/catalog/system_views.sql | 15 +
src/backend/postmaster/pgstat.c | 451 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 48 ++-
src/backend/utils/adt/pgstatfuncs.c | 106 ++++++
src/backend/utils/error/elog.c | 16 +
src/include/catalog/pg_proc.dat | 8 +
src/include/pgstat.h | 73 ++++
src/include/utils/elog.h | 1 +
src/test/regress/expected/rules.out | 12 +
10 files changed, 853 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..06c4b0c8a5 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that occurred on a subscription, showing information
+ about the subscription errors.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,123 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view contains one
+ row per error reported by each subscription's apply worker, plus additional
+ rows for errors reported by the workers handling the initial data copy of
+ the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID on the publisher node whose changes were being
+ applied when the error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times an error happened on the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the time of the last failure.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..6031f063d2 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,18 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message
+ FROM pg_subscription as s,
+ LATERAL pg_stat_get_subscription_errors(s.oid) as e
+ JOIN pg_database as d ON (e.datid = d.oid);
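For reviewers, here is a hypothetical usage sketch of the proposed view (it assumes a server built with this patch and the column set defined in the hunk above; nothing here is committed behavior):

```sql
-- Illustrative only: most recent error per subscription first
SELECT subname, relid::regclass AS relation, command, xid,
       failure_count, last_failure, last_failure_message
FROM pg_stat_subscription_errors
ORDER BY last_failure DESC;
```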
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 11702f2a80..fc79b724e8 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/partition.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +107,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE 32
/* ----------
@@ -279,6 +281,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionErrHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -320,6 +323,11 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubErrEntry * pgstat_get_subscription_error_entry(Oid subid,
+ bool create);
+static PgStat_StatSubRelErrEntry * pgstat_get_subscription_rel_error_entry(PgStat_StatSubErrEntry *suberrent,
+ Oid subrelid);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -358,6 +366,8 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1134,6 +1144,52 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions in stats hashtable and tell the
+ * stats collector to drop them.
+ */
+ if (subscriptionErrHash)
+ {
+ PgStat_MsgSubscriptionPurge s_msg;
+ PgStat_StatSubErrEntry *suberrent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ s_msg.m_nentries = 0;
+ hash_seq_init(&hstat, subscriptionErrHash);
+ while ((suberrent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(suberrent->subid), HASH_FIND, NULL) == NULL)
+ s_msg.m_subids[s_msg.m_nentries++] = suberrent->subid;
+
+ /* If the message is full, send it out and reinitialize to empty */
+ if (s_msg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + s_msg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&s_msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&s_msg, len);
+ s_msg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest */
+ if (s_msg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + s_msg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&s_msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&s_msg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1863,6 +1919,56 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about a subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg) + 1;
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_clear = false;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_last_failure = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error of a subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_clear = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_clear) + sizeof(bool));
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2895,6 +3001,23 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the logical replication error struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, false);
+}
+
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3424,6 +3547,14 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3725,6 +3856,41 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionErrHash)
+ {
+ PgStat_StatSubErrEntry *suberrent;
+
+ hash_seq_init(&hstat, subscriptionErrHash);
+ while ((suberrent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASH_SEQ_STATUS relhstat;
+ long nrels;
+
+ /* Skip this subscription if it doesn't have any errors */
+ if (suberrent->suberrors == NULL)
+ continue;
+
+ nrels = hash_get_num_entries(suberrent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(suberrent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* the number of errors follows */
+ rc = fwrite(&nrels, sizeof(long), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ hash_seq_init(&relhstat, suberrent->suberrors);
+ while ((relerrent = (PgStat_StatSubRelErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ rc = fwrite(relerrent, sizeof(PgStat_StatSubRelErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4184,6 +4350,96 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubErrEntry struct followed by the number of
+ * errors and PgStat_StatSubRelErrEntry structs describing the
+ * subscription's errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubErrEntry suberrbuf;
+ PgStat_StatSubErrEntry *suberrent;
+ long nrels;
+
+ /* Read the subscription entry */
+ if (fread(&suberrbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionErrHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionErrHash = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription error entry */
+ suberrent =
+ (PgStat_StatSubErrEntry *) hash_search(subscriptionErrHash,
+ (void *) &(suberrbuf.subid),
+ HASH_ENTER, NULL);
+ suberrent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nrels, 1, sizeof(long), fpin) != sizeof(long))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nrels; i++)
+ {
+ PgStat_StatSubRelErrEntry *subrelent;
+ PgStat_StatSubRelErrEntry subrelbuf;
+
+ if (fread(&subrelbuf, 1, sizeof(PgStat_StatSubRelErrEntry), fpin) !=
+ sizeof(PgStat_StatSubRelErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (suberrent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubRelErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ suberrent->suberrors = hash_create("Subscription relation error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the error information to the subscription hash */
+ subrelent =
+ (PgStat_StatSubRelErrEntry *) hash_search(suberrent->suberrors,
+ (void *) &(subrelbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(subrelent, &subrelbuf, sizeof(PgStat_StatSubRelErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4526,6 +4782,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubErrEntry struct followed by the number of
+ * errors and PgStat_StatSubRelErrEntry structs describing the
+ * subscription's errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubErrEntry mySubErrs;
+ PgStat_StatSubRelErrEntry subrelbuf;
+ long nrels;
+
+ if (fread(&mySubErrs, 1, sizeof(PgStat_StatSubErrEntry), fpin)
+ != sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nrels, 1, sizeof(long), fpin) != sizeof(long))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ for (int i = 0; i < nrels; i++)
+ {
+ if (fread(&subrelbuf, 1, sizeof(PgStat_StatSubRelErrEntry), fpin) !=
+ sizeof(PgStat_StatSubRelErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4716,6 +5016,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionErrHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5650,6 +5951,76 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *suberrent;
+ PgStat_StatSubRelErrEntry *relerrent;
+
+ /* Get subscription errors */
+ suberrent = pgstat_get_subscription_error_entry(msg->m_subid, true);
+ Assert(suberrent);
+
+ /* Get the error entry of the relation */
+ relerrent = pgstat_get_subscription_rel_error_entry(suberrent,
+ msg->m_subrelid);
+ Assert(relerrent);
+
+ if (msg->m_clear)
+ {
+ /* reset all fields except for databaseid and failure_count */
+ relerrent->relid = InvalidOid;
+ relerrent->command = 0;
+ relerrent->xid = InvalidTransactionId;
+ relerrent->last_failure = 0;
+ relerrent->errmsg[0] = '\0';
+ }
+ else
+ {
+ relerrent->databaseid = msg->m_databaseid;
+ relerrent->relid = msg->m_relid;
+ relerrent->command = msg->m_command;
+ relerrent->xid = msg->m_xid;
+ relerrent->failure_count++;
+ relerrent->last_failure = msg->m_last_failure;
+ strlcpy(relerrent->errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionErrHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubErrEntry *suberrent;
+
+ suberrent = hash_search(subscriptionErrHash, (void *) &(msg->m_subids[i]),
+ HASH_FIND, NULL);
+
+ /* Skip if this subscription has no entry */
+ if (suberrent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (suberrent->suberrors != NULL)
+ hash_destroy(suberrent->suberrors);
+
+ (void) hash_search(subscriptionErrHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5747,6 +6118,86 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription OID.
+ * Return NULL if not found and the caller didn't request to create it.
+ *
+ * 'create' tells whether to create a new entry if it is not found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, bool create)
+{
+ PgStat_StatSubErrEntry *suberrent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionErrHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subscriptionErrHash = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ suberrent = (PgStat_StatSubErrEntry *) hash_search(subscriptionErrHash,
+ (void *) &subid,
+ action, &found);
+
+ if (create && !found)
+ suberrent->suberrors = NULL;
+
+ return suberrent;
+}
+
+/* ----------
+ * pgstat_get_subscription_rel_error_entry
+ *
+ * Look up, and create if necessary, the error entry for 'subrelid' in 'suberrent'.
+ * ----------
+ */
+static PgStat_StatSubRelErrEntry *
+pgstat_get_subscription_rel_error_entry(PgStat_StatSubErrEntry *suberrent,
+ Oid subrelid)
+{
+ PgStat_StatSubRelErrEntry *relerrent;
+ bool found;
+
+ if (suberrent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubRelErrEntry);
+ suberrent->suberrors = hash_create("Subscription relation error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ relerrent = (PgStat_StatSubRelErrEntry *) hash_search(suberrent->suberrors,
+ (void *) &subrelid,
+ HASH_ENTER, &found);
+
+ /* initialize fields */
+ if (!found)
+ {
+ relerrent->command = 0;
+ relerrent->xid = InvalidTransactionId;
+ relerrent->failure_count = 0;
+ relerrent->last_failure = 0;
+ relerrent->errmsg[0] = '\0';
+ }
+
+ return relerrent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c23713468c..7c2ec983bb 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -227,6 +227,7 @@ typedef struct ApplyErrCallbackArg
LogicalRepMsgType command; /* 0 if invalid */
/* Local relation information */
+ Oid relid; /* used for error reporting */
char *nspname; /* used for error context */
char *relname; /* used for error context */
@@ -236,6 +237,7 @@ typedef struct ApplyErrCallbackArg
static ApplyErrCallbackArg apply_error_callback_arg =
{
.command = 0,
+ .relid = InvalidOid,
.relname = NULL,
.nspname = NULL,
.remote_xid = InvalidTransactionId,
@@ -3524,8 +3526,26 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ geterrmessage());
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3643,7 +3663,24 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.relid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ geterrmessage());
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
@@ -3688,6 +3725,7 @@ apply_error_callback(void *arg)
static void
set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
{
+ apply_error_callback_arg.relid = rel->localreloid;
apply_error_callback_arg.nspname = rel->remoterel.nspname;
apply_error_callback_arg.relname = rel->remoterel.relname;
}
@@ -3696,6 +3734,7 @@ set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
static void
reset_apply_error_context_rel(void)
{
+ apply_error_callback_arg.relid = InvalidOid;
apply_error_callback_arg.nspname = NULL;
apply_error_callback_arg.relname = NULL;
}
@@ -3725,6 +3764,7 @@ set_logicalrep_error_context_rel(Relation rel)
{
if (IsLogicalWorker())
{
+ apply_error_callback_arg.relid = RelationGetRelid(rel);
apply_error_callback_arg.nspname =
get_namespace_name(RelationGetNamespace(rel));
apply_error_callback_arg.relname =
@@ -3738,6 +3778,8 @@ reset_logicalrep_error_context_rel(void)
{
if (IsLogicalWorker())
{
+ apply_error_callback_arg.relid = InvalidOid;
+
if (apply_error_callback_arg.nspname)
pfree(apply_error_callback_arg.nspname);
apply_error_callback_arg.nspname = NULL;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f0e09eae4d..b155c30fcf 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,8 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
+#include "replication/logicalworker.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2380,3 +2382,107 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the logical replication error for the given subscription.
+ */
+Datum
+pg_stat_get_subscription_errors(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ MemoryContext per_query_ctx;
+ MemoryContext oldcontext;
+ PgStat_StatSubErrEntry *suberrent;
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASH_SEQ_STATUS hstat;
+
+ /* check to see if caller supports us returning a tuplestore */
+ if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not allowed in this context")));
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+ oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Get subscription errors */
+ suberrent = pgstat_fetch_subscription_error(subid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (suberrent == NULL || suberrent->suberrors == NULL)
+ PG_RETURN_NULL();
+
+ hash_seq_init(&hstat, suberrent->suberrors);
+ while ((relerrent = (PgStat_StatSubRelErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* databaseid */
+ values[0] = ObjectIdGetDatum(relerrent->databaseid);
+
+ /* subid */
+ values[1] = ObjectIdGetDatum(subid);
+
+ /* relid */
+ if (OidIsValid(relerrent->relid))
+ values[2] = ObjectIdGetDatum(relerrent->relid);
+ else
+ nulls[2] = true;
+
+ /* command */
+ if (OidIsValid(relerrent->subrelid))
+ nulls[3] = true;
+ else
+ values[3] = CStringGetTextDatum(logicalrep_message_type(relerrent->command));
+
+ /* xid */
+ if (TransactionIdIsValid(relerrent->xid))
+ values[4] = TransactionIdGetDatum(relerrent->xid);
+ else
+ nulls[4] = true;
+
+ /* failure_source */
+ if (OidIsValid(relerrent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ /* failure_count */
+ values[6] = Int64GetDatum(relerrent->failure_count);
+
+ /* last_failure */
+ values[7] = TimestampTzGetDatum(relerrent->last_failure);
+
+ /* failure_message */
+ values[8] = CStringGetTextDatum(relerrent->errmsg);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ }
+
+ /* clean up and return the tuplestore */
+ tuplestore_donestoring(tupstore);
+
+ return (Datum) 0;
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..dd36850016 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,6 +1441,22 @@ getinternalerrposition(void)
return edata->internalpos;
}
+/*
+ * geterrmessage --- return the currently set error message
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+const char *
+geterrmessage(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return (const char *) edata->message;
+}
/*
* Functions to allow construction of error message strings separately from
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 8cd0252082..92297d60d1 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5321,6 +5321,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about logical replication error',
+ proname => 'pg_stat_get_subscription_error', prorows => '10', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid',
+ proallargtypes => '{oid,oid,oid,oid,text,xid,text,int8,timestamptz,text}',
+ proargmodes => '{i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message}',
+ prosrc => 'pg_stat_get_subscription_errors' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9612c0a6c2..215ac3abd5 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,8 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -539,6 +542,47 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
+ *							update the error happening during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ bool m_clear;
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_last_failure;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge	Sent by autovacuum to purge the subscription entries.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -710,6 +754,8 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -908,6 +954,28 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription error statistics kept in the stats collector
+ */
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubErrEntry;
+
+typedef struct PgStat_StatSubRelErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise the table
+ * sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must be the same
+ * as subrelid in the table sync case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_StatSubRelErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1011,6 +1079,10 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_initialize(void);
@@ -1106,6 +1178,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f53607e12e..155145a77d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -222,6 +222,7 @@ extern int err_generic_string(int field, const char *str);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
+extern const char *geterrmessage(void);
/*----------
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..9cde7c2d09 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,18 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message
+ FROM pg_subscription s,
+ (LATERAL pg_stat_get_subscription_error(s.oid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message)
+ JOIN pg_database d ON ((e.datid = d.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
--
2.24.3 (Apple Git-128)
On Mon, Jul 19, 2021 at 12:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Jul 17, 2021 at 12:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
1. How to clear apply worker errors. IIUC we've discussed that once
the apply worker skipped the transaction we leave the error entry
itself but clear its fields except for some fields such as failure
counts. But given that the stats messages could be lost, how can we
ensure to clear those error details? For table sync workers’ error, we
can have autovacuum workers periodically check entries of
pg_subscription_rel and clear the error entry if the table sync worker
completes table sync (i.e., checking if srsubstate = ‘r’). But there
is no such information for the apply workers and subscriptions. In
addition to sending the message clearing the error details just after
skipping the transaction, I thought that we can have apply workers
periodically send the message clearing the error details but it seems
not good.

I think that the motivation behind the idea of leaving error entries
and clearing some of their fields is that users can check if the error
is successfully resolved and the worker is working fine. But we can
check it also in another way, for example, checking
pg_stat_subscription view. So is it worth considering leaving the
apply worker errors as they are?
I think so. Basically, we will send the clear message after skipping
the xact but I think it is fine if that message is lost. At worst, it
will be displayed as the last error details. If there is another error
it will be overwritten or probably we should have a function *_reset()
which allows the user to reset a particular subscription's error info.
--
With Regards,
Amit Kapila.
On July 19, 2021 2:40 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patch that incorporated all comments
I got so far except for the clearing error details part I mentioned
above. After getting a consensus on those parts, I'll incorporate the
idea into the patches.
Hi Sawada-san,
I am interested in this feature.
After having a look at the patch, I have a few questions about it.
(Sorry in advance if I missed something)
1) In 0002 patch, it introduces a new view called pg_stat_subscription_errors.
Since it won't be cleaned automatically after we resolve the conflict, do we
need a reset function to clean the statistics in it? Maybe something
similar to pg_stat_reset_replication_slot, which cleans
pg_stat_replication_slots.
2) For 0003 patch, when I am faced with a conflict, I set skip_xid = xxx, and
then I resolve the conflict. If I reset skip_xid after resolving the
conflict, will the change (which caused the conflict before) be applied again?
3) For 0003 patch, if the user sets skip_xid to a wrong xid which has not been
assigned, will the change be skipped when the xid is assigned in
the future even if it doesn't cause any conflicts?
Besides, It might be better to add some description of patch in each patch's
commit message which will make it easier for new reviewers to follow.
Best regards,
Houzj
On Mon, Jul 19, 2021 at 5:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 19, 2021 at 12:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Jul 17, 2021 at 12:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
1. How to clear apply worker errors. IIUC we've discussed that once
the apply worker skipped the transaction we leave the error entry
itself but clear its fields except for some fields such as failure
counts. But given that the stats messages could be lost, how can we
ensure to clear those error details? For table sync workers’ error, we
can have autovacuum workers periodically check entries of
pg_subscription_rel and clear the error entry if the table sync worker
completes table sync (i.e., checking if srsubstate = ‘r’). But there
is no such information for the apply workers and subscriptions. In
addition to sending the message clearing the error details just after
skipping the transaction, I thought that we can have apply workers
periodically send the message clearing the error details but it seems
not good.

I think that the motivation behind the idea of leaving error entries
and clearing some of their fields is that users can check if the error
is successfully resolved and the worker is working fine. But we can
check it also in another way, for example, checking
pg_stat_subscription view. So is it worth considering leaving the
apply worker errors as they are?

I think so. Basically, we will send the clear message after skipping
the xact but I think it is fine if that message is lost. At worst, it
will be displayed as the last error details. If there is another error
it will be overwritten or probably we should have a function *_reset()
which allows the user to reset a particular subscription's error info.
That makes sense. I'll incorporate this idea in the next version patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Jul 19, 2021 at 8:38 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On July 19, 2021 2:40 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patch that incorporated all comments
I got so far except for the clearing error details part I mentioned
above. After getting a consensus on those parts, I'll incorporate the
idea into the patches.

Hi Sawada-san,
I am interested in this feature.
After having a look at the patch, I have a few questions about it.
Thank you for having a look at the patches!
1) In 0002 patch, it introduces a new view called pg_stat_subscription_errors.
Since it won't be cleaned automatically after we resolve the conflict, do we
need a reset function to clean the statistics in it? Maybe something
similar to pg_stat_reset_replication_slot, which cleans
pg_stat_replication_slots.
Agreed. As Amit also mentioned, providing a reset function to clean
the statistics seems a good idea. If the message clearing the stats
that is sent after skipping the transaction gets lost, the user is
able to reset those stats manually.
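To illustrate, a manual reset could look like the following sketch. The function name matches the pg_stat_reset_subscription_error() that appears in the updated patches later in this thread, but the argument list shown here is an assumption:

```sql
-- Inspect the recorded error for the subscription
-- (view introduced by the 0002 patch).
SELECT subname, relid, xid, failure_count, last_failure_message
FROM pg_stat_subscription_errors
WHERE subname = 'test_sub';

-- After resolving the conflict, clear the stale error entry manually
-- in case the automatic clear message was lost. The (subid, relid)
-- signature is hypothetical.
SELECT pg_stat_reset_subscription_error(
         (SELECT oid FROM pg_subscription WHERE subname = 'test_sub'),
         NULL);
```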
2) For 0003 patch, when I am faced with a conflict, I set skip_xid = xxx, and
then I resolve the conflict. If I reset skip_xid after resolving the
conflict, will the change (which caused the conflict before) be applied again?
The apply worker checks skip_xid when it reads the subscription.
Therefore, if you reset skip_xid before the apply worker restarts and
skips the transaction, the change is applied. But if you reset
skip_xid after the apply worker skips the transaction, the change is
already skipped and your resetting skip_xid has no effect.
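The timing described above can be sketched with the syntax from the proposed patches (xid 716 is the value from the documentation example; RESET is the new clause the patch adds):

```sql
-- The apply worker keeps failing on remote xid 716; tell it to skip
-- that transaction.
ALTER SUBSCRIPTION test_sub SET (skip_xid = 716);

-- If this RESET runs before the worker has skipped the transaction,
-- the change is applied normally on the next attempt:
ALTER SUBSCRIPTION test_sub RESET (skip_xid);

-- Once the worker has already skipped xid 716, it clears
-- pg_subscription.subskipxid itself, and a later RESET has no effect.
SELECT subskipxid FROM pg_subscription WHERE subname = 'test_sub';
```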
3) For 0003 patch, if the user sets skip_xid to a wrong xid which has not been
assigned, will the change be skipped when the xid is assigned in
the future even if it doesn't cause any conflicts?
Yes. Currently, setting a correct xid is the user's responsibility. I
think it would be better to disable it or emit WARNING/ERROR when the
user mistakenly sets a wrong xid, if we find a convenient way to
detect that.
Besides, It might be better to add some description of patch in each patch's
commit message which will make it easier for new reviewers to follow.
I'll add commit messages in the next version patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jul 20, 2021 at 6:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 19, 2021 at 8:38 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

3) For 0003 patch, if the user sets skip_xid to a wrong xid which has not been
assigned, will the change be skipped when the xid is assigned in
the future even if it doesn't cause any conflicts?

Yes. Currently, setting a correct xid is the user's responsibility. I
think it would be better to disable it or emit WARNING/ERROR when the
user mistakenly sets a wrong xid, if we find a convenient way to
detect that.
I think in this regard we should clearly document how this can be
misused by users. I see that you have mentioned skip_xid but
maybe we can add more on how it could lead to skipping a
non-conflicting XID and can lead to an inconsistent replica. As
discussed earlier as well, users can anyway do similar harm by using
pg_replication_slot_advance(). I think if possible we might want to
give some examples as well where it would be helpful for users to use
this functionality.
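For comparison, the existing origin-based workaround that the patch documentation mentions looks roughly like the sketch below on the subscriber. The origin name follows the pg_&lt;subscription oid&gt; convention and the LSN is a made-up placeholder; picking an LSN past the end of the failing transaction is exactly the error-prone step, since every other transaction up to that LSN is skipped too:

```sql
-- Find the replication origin used by the subscription
-- and its current position.
SELECT * FROM pg_replication_origin_status;

-- With the subscription disabled, advance the origin past the
-- failing transaction. The LSN here is an example value only.
ALTER SUBSCRIPTION test_sub DISABLE;
SELECT pg_replication_origin_advance('pg_16395', '0/14A2D60'::pg_lsn);
ALTER SUBSCRIPTION test_sub ENABLE;
```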
--
With Regards,
Amit Kapila.
On July 20, 2021 9:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 19, 2021 at 8:38 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

On July 19, 2021 2:40 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
I've attached the updated version patch that incorporated all
comments I got so far except for the clearing error details part I
mentioned above. After getting a consensus on those parts, I'll
incorporate the idea into the patches.

3) For 0003 patch, if the user sets skip_xid to a wrong xid which has not been
assigned, will the change be skipped when the xid is assigned in
the future even if it doesn't cause any conflicts?

Yes. Currently, setting a correct xid is the user's responsibility. I think it would
be better to disable it or emit WARNING/ERROR when the user mistakenly sets
a wrong xid, if we find a convenient way to detect that.
Thanks for the explanation. As Amit suggested, it seems we can document the
risk of misusing skip_xid. Besides, I found some minor things in the patch.
1) In 0002 patch
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionErrHash != NULL)
+ return;
+
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
the second parameter "len" seems not to be used in the functions
pgstat_recv_subscription_purge() and pgstat_recv_subscription_error().
2) in 0003 patch
* Helper function for apply_handle_commit and apply_handle_stream_commit.
*/
static void
-apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
+apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
This looks like a separate change which removes an unused parameter in existing
code; maybe we can get this committed first?
Best regards,
Houzj
On Thu, Jul 22, 2021 at 8:53 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On July 20, 2021 9:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 19, 2021 at 8:38 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

On July 19, 2021 2:40 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
I've attached the updated version patch that incorporated all
comments I got so far except for the clearing error details part I
mentioned above. After getting a consensus on those parts, I'll
incorporate the idea into the patches.

3) For 0003 patch, if the user sets skip_xid to a wrong xid which has not been
assigned, will the change be skipped when the xid is assigned in
the future even if it doesn't cause any conflicts?

Yes. Currently, setting a correct xid is the user's responsibility. I think it would
be better to disable it or emit WARNING/ERROR when the user mistakenly sets
a wrong xid, if we find a convenient way to detect that.

Thanks for the explanation. As Amit suggested, it seems we can document the
risk of misusing skip_xid. Besides, I found some minor things in the patch.

1) In 0002 patch

+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+	if (subscriptionErrHash != NULL)
+		return;
+
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{

the second parameter "len" seems not to be used in the functions
pgstat_recv_subscription_purge() and pgstat_recv_subscription_error().
'len' is not used at all, not only in the functions the patch added but
also in other pgstat_recv_* functions. Can we remove all of them in a
separate patch? 'len' in pgstat_recv_* functions has never been used
since the stats collector code was introduced. It seems that it
was mistakenly introduced in the first commit, and the other pgstat_recv_*
functions that followed it defined 'len' but didn't use it either.
2) in 0003 patch
 * Helper function for apply_handle_commit and apply_handle_stream_commit.
 */
static void
-apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
+apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{

This looks like a separate change which removes an unused parameter in existing
code; maybe we can get this committed first?
Yeah, it seems to be introduced by commit 0926e96c493. I've attached
the patch for that.
Also, I've attached the updated version patches. This version of the
patch has a pg_stat_reset_subscription_error() SQL function and sends a
clear message after skipping the transaction. The 0004 patch includes the
transaction-skipping feature and introduces RESET to ALTER
SUBSCRIPTION. It would be better to separate them.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
0001-Remove-unused-function-argument-in-apply_handle_comm.patch
From 95e1b7934e93f8e1920aff526f0be56888c3aa20 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 26 Jul 2021 11:34:36 +0900
Subject: [PATCH 1/4] Remove unused function argument in
apply_handle_commit_internal()
---
src/backend/replication/logical/worker.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index b9a7a7ffbb..186be1a188 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -309,8 +309,7 @@ static void maybe_reread_subscription(void);
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
-static void apply_handle_commit_internal(StringInfo s,
- LogicalRepCommitData *commit_data);
+static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
TupleTableSlot *remoteslot);
@@ -853,7 +852,7 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(s, &commit_data);
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -1390,7 +1389,7 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
nchanges, path);
- apply_handle_commit_internal(s, &commit_data);
+ apply_handle_commit_internal(&commit_data);
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -1405,7 +1404,7 @@ apply_handle_stream_commit(StringInfo s)
* Helper function for apply_handle_commit and apply_handle_stream_commit.
*/
static void
-apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
+apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
if (IsTransactionState())
{
--
2.24.3 (Apple Git-128)
v3-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch
From 859bb995732dac1c7479b2a103c0c2da78967ffb Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:18:58 +0900
Subject: [PATCH v3 3/3] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question.
The user can specify the XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating the pg_subscription.subskipxid field and telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction. After skipping
the transaction, the apply worker clears subskipxid. It also clears the
error statistics of the subscription in the pg_stat_subscription_errors
system view.
To reset the skip_xid parameter (and other parameters), this commit also
adds a RESET command to ALTER SUBSCRIPTION.
---
doc/src/sgml/logical-replication.sgml | 49 +++-
doc/src/sgml/ref/alter_subscription.sgml | 46 +++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 146 +++++++++--
src/backend/parser/gram.y | 11 +-
src/backend/postmaster/pgstat.c | 39 ++-
src/backend/replication/logical/worker.c | 273 +++++++++++++++++----
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 4 +-
src/include/pgstat.h | 4 +-
src/test/regress/expected/subscription.out | 24 ++
src/test/regress/sql/subscription.sql | 20 ++
src/test/subscription/t/023_skip_xact.pl | 185 ++++++++++++++
13 files changed, 731 insertions(+), 84 deletions(-)
create mode 100644 src/test/subscription/t/023_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..992d8b4ac1 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <link linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in <structname>pg_stat_subscription_errors</structname>
+ view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: during apply of "INSERT" for relation "public.test" in transaction with xid 716 committs 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> to the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Either way, these should be used as a last resort. They skip the whole
+ transaction, including changes that may not violate any constraint, and can
+ easily make the subscriber inconsistent if the user specifies the wrong
+ transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..591f554fc7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,15 +193,46 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
- This clause alters parameters originally set by
- <xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
- are <literal>slot_name</literal>,
- <literal>synchronous_commit</literal>,
- <literal>binary</literal>, and
- <literal>streaming</literal>.
+ This clause sets or resets a subscription option. The parameters that can be
+ set are the parameters originally set by <xref linkend="sql-createsubscription"/>:
+ <literal>slot_name</literal>, <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>, and the following
+ parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ stops until the problem is resolved. The resolution can be done
+ either by changing data on the subscriber so that it doesn't
+ conflict with the incoming change or by skipping the whole
+ transaction. This option specifies the ID of the transaction whose
+ application the logical replication worker skips. The worker skips
+ all data modification changes within the specified transaction.
+ Therefore, since it skips the whole transaction, including changes
+ that may not violate any constraint, it should only be used as a
+ last resort. This option has no effect on a transaction that is
+ already prepared on the subscriber with
+ <literal>two_phase</literal> enabled. After the logical
+ replication worker successfully skips the transaction, the
+ transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>,
+ and <literal>skip_xid</literal>.
</para>
</listitem>
</varlistentry>
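To make the documented syntax concrete, here is a sketch of how the proposed option would be used (the subscription name and XID are hypothetical, matching the example error context shown earlier in this mail):

```sql
-- Skip the remote transaction whose XID appeared in the apply worker's
-- error context (590 in the example above).
ALTER SUBSCRIPTION test_sub SET (skip_xid = 590);

-- If the wrong XID was specified, clear it before it is consumed.
ALTER SUBSCRIPTION test_sub RESET (skip_xid);
```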
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d76bdff36a..8ecc55150e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index da02d3bbfa..0cc965c056 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -99,7 +101,8 @@ static void ReportSlotConnectionError(HTAB *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -128,12 +131,23 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset)
+ {
+ if (defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
+ }
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -141,7 +155,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_CONNECT;
- opts->connect = defGetBoolean(defel);
+ if (!is_reset)
+ opts->connect = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_ENABLED) &&
strcmp(defel->defname, "enabled") == 0)
@@ -150,7 +165,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_ENABLED;
- opts->enabled = defGetBoolean(defel);
+ if (!is_reset)
+ opts->enabled = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_CREATE_SLOT) &&
strcmp(defel->defname, "create_slot") == 0)
@@ -159,7 +175,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_CREATE_SLOT;
- opts->create_slot = defGetBoolean(defel);
+ if (!is_reset)
+ opts->create_slot = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_SLOT_NAME) &&
strcmp(defel->defname, "slot_name") == 0)
@@ -168,7 +185,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SLOT_NAME;
- opts->slot_name = defGetString(defel);
+ if (!is_reset)
+ opts->slot_name = defGetString(defel);
/* Setting slot_name = NONE is treated as no slot name. */
if (strcmp(opts->slot_name, "none") == 0)
@@ -183,7 +201,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_COPY_DATA;
- opts->copy_data = defGetBoolean(defel);
+ if (!is_reset)
+ opts->copy_data = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_SYNCHRONOUS_COMMIT) &&
strcmp(defel->defname, "synchronous_commit") == 0)
@@ -192,12 +211,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -206,7 +231,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_REFRESH;
- opts->refresh = defGetBoolean(defel);
+ if (!is_reset)
+ opts->refresh = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_BINARY) &&
strcmp(defel->defname, "binary") == 0)
@@ -215,7 +241,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +251,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -245,7 +273,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
- opts->twophase = defGetBoolean(defel);
+ if (!is_reset)
+ opts->twophase = defGetBoolean(defel);
+ }
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
}
else
ereport(ERROR,
@@ -416,7 +468,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -489,6 +542,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -885,14 +939,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -944,14 +998,60 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_STREAMING |
+ SUBOPT_BINARY | SUBOPT_SKIP_XID);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -986,7 +1086,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1036,7 +1136,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1084,7 +1184,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 10da5c5c51..41a1d333f6 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9699,7 +9699,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index a8d2a0bc65..bec79056b9 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1699,6 +1699,27 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
msg.m_subid = subid;
msg.m_subrelid = subrelid;
msg.m_reset = true;
+ msg.m_clear = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_clear) + sizeof(bool));
+}
+
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the stats collector to clear the error of the given subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = false;
+ msg.m_clear = true;
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_clear) + sizeof(bool));
}
@@ -2046,6 +2067,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
msg.m_reset = false;
+ msg.m_clear = false;
msg.m_command = command;
msg.m_xid = xid;
msg.m_last_failure = GetCurrentTimestamp();
@@ -6085,16 +6107,27 @@ pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
true);
Assert(relerrent);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
- /* reset fields and set reset timestamp */
+ Assert(!(msg->m_reset && msg->m_clear));
+
+ /* reset fields */
relerrent->relid = InvalidOid;
relerrent->command = 0;
relerrent->xid = InvalidTransactionId;
relerrent->last_failure = 0;
relerrent->errmsg[0] = '\0';
- relerrent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ relerrent->failure_count = 0;
+ relerrent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 4f9c4e9014..56d78ba905 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -277,6 +278,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_changes is true if we're skipping all data modification changes of
+ * the transaction specified by MySubscription->skipxid, which is copied to
+ * skipping_xid. Note that we don't skip receiving the changes, particularly
+ * in streaming cases, since we decide whether or not to skip applying the
+ * changes when starting to apply them. Once we start skipping changes, we
+ * copy the XID to skipping_xid and don't stop skipping until we have skipped
+ * the whole transaction, even if the subscription is invalidated and
+ * MySubscription->skipxid gets changed or reset. When stopping the skipping
+ * behavior, we reset the skip XID (subskipxid) in pg_subscription and
+ * associate the origin state with the transaction that resets the skip XID
+ * so that we can restart streaming from the next transaction.
+ */
+static bool skipping_changes = false;
+static TransactionId skipping_xid = InvalidTransactionId;
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -360,6 +376,9 @@ static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static void reset_apply_error_context_rel(void);
static void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -857,6 +876,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -881,7 +905,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * If we're skipping changes of this transaction, stop doing so and
+ * record the flush position. Otherwise, commit the changes that were
+ * just applied.
+ */
+ if (skipping_changes)
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -910,6 +945,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -934,47 +972,57 @@ apply_handle_prepare(StringInfo s)
LSN_FORMAT_ARGS(remote_final_lsn))));
/*
- * Compute unique GID for two_phase transactions. We don't use GID of
- * prepared transaction sent by server as that can lead to deadlock when
- * we have multiple subscriptions from same node point to publications on
- * the same node. See comments atop worker.c
+ * Prepare the transaction unless we have skipped all changes of this
+ * transaction.
*/
- TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
- gid, sizeof(gid));
+ if (skipping_changes)
+ stop_skipping_changes(prepare_data.end_lsn, prepare_data.prepare_time);
+ else
+ {
+ /*
+ * Compute unique GID for two_phase transactions. We don't use GID of
+ * prepared transaction sent by server as that can lead to deadlock
+ * when we have multiple subscriptions from same node point to
+ * publications on the same node. See comments atop worker.c
+ */
+ TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
+ gid, sizeof(gid));
- /*
- * Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
- *
- * XXX, We can optimize such that at commit prepared time, we first check
- * whether we have prepared the transaction or not but that doesn't seem
- * worthwhile because such cases shouldn't be common.
- */
- begin_replication_step();
+ /*
+ * Unlike commit, here, we always prepare the transaction even though
+ * no change has happened in this transaction. It is done this way
+ * because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
+ *
+ * XXX, We can optimize such that at commit prepared time, we first
+ * check whether we have prepared the transaction or not but that
+ * doesn't seem worthwhile because such cases shouldn't be common.
+ */
+ begin_replication_step();
- /*
- * BeginTransactionBlock is necessary to balance the EndTransactionBlock
- * called within the PrepareTransactionBlock below.
- */
- BeginTransactionBlock();
- CommitTransactionCommand(); /* Completes the preceding Begin command. */
+ /*
+ * BeginTransactionBlock is necessary to balance the
+ * EndTransactionBlock called within the PrepareTransactionBlock
+ * below.
+ */
+ BeginTransactionBlock();
+ CommitTransactionCommand(); /* Completes the preceding Begin command. */
- /*
- * Update origin state so we can restart streaming from correct position
- * in case of crash.
- */
- replorigin_session_origin_lsn = prepare_data.end_lsn;
- replorigin_session_origin_timestamp = prepare_data.prepare_time;
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = prepare_data.end_lsn;
+ replorigin_session_origin_timestamp = prepare_data.prepare_time;
- PrepareTransactionBlock(gid);
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ PrepareTransactionBlock(gid);
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
- store_flush_position(prepare_data.end_lsn);
+ }
+ store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
/* Process any tables that are being synchronized in parallel. */
@@ -1087,9 +1135,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !skipping_changes &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1111,6 +1160,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1123,9 +1175,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1207,6 +1256,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1299,6 +1349,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping changes if we have been skipping this transaction */
+ if (skipping_changes)
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1309,11 +1363,11 @@ static void
apply_handle_stream_commit(StringInfo s)
{
TransactionId xid;
+ LogicalRepCommitData commit_data;
StringInfoData s2;
int nchanges;
char path[MAXPGPATH];
char *buffer = NULL;
- LogicalRepCommitData commit_data;
StreamXidHash *ent;
MemoryContext oldcxt;
BufFile *fd;
@@ -1327,8 +1381,13 @@ apply_handle_stream_commit(StringInfo s)
apply_error_callback_arg.remote_xid = xid;
apply_error_callback_arg.committs = commit_data.committime;
+ remote_final_lsn = commit_data.commit_lsn;
+
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/* Make sure we have an open transaction */
begin_replication_step();
@@ -1360,13 +1419,12 @@ apply_handle_stream_commit(StringInfo s)
MemoryContextSwitchTo(oldcxt);
- remote_final_lsn = commit_data.commit_lsn;
-
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
*/
in_remote_transaction = true;
+
pgstat_report_activity(STATE_RUNNING, NULL);
end_replication_step();
@@ -1439,7 +1497,17 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
nchanges, path);
- apply_handle_commit_internal(&commit_data);
+ if (skipping_changes)
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -1448,7 +1516,6 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
-
reset_apply_error_context_info();
}
@@ -2328,6 +2395,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
ErrorContextCallback errcallback;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (skipping_changes &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Push apply error context callback. Other fields will be filled during
* applying the change.
@@ -3788,3 +3866,108 @@ reset_logicalrep_error_context_rel(void)
apply_error_callback_arg.relname = NULL;
}
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!skipping_changes);
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_changes = true;
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID.
+ *
+ * If origin_lsn and origin_committs are valid, we associate the origin
+ * state with the transaction that resets the skip XID so that we can
+ * restart streaming from the transaction next to the one we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(skipping_changes);
+ Assert(TransactionIdIsValid(skipping_xid));
+ Assert(in_remote_transaction);
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_changes = false;
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know
+ * that the subscription is no longer stuck on the conflict.
+ *
+ * The message for clearing the error statistics can be lost, but that's
+ * okay. The user can see that logical replication is working fine in
+ * other ways, for example by checking the pg_stat_subscription view,
+ * and can reset the error statistics of a single subscription with the
+ * pg_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 947660a4b0..19d00d9eac 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3658,7 +3658,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3675,6 +3676,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 1104886bef..4a1185a4f6 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -563,9 +563,10 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The clear messages use below field */
+ /* The reset and clear messages use the fields below */
bool m_reset; /* clear all fields and set reset_stats
* timestamp */
+ bool m_clear; /* clear all fields except for total_failure */
/* The error report message uses below fields */
Oid m_databaseid;
@@ -1101,6 +1102,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 67f92b3878..4883e6267f 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -286,6 +286,30 @@ ERROR: unrecognized subscription parameter: "two_phase"
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ERROR: cannot set streaming = true for two-phase enabled subscription
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid, synchronous_commit, binary, streaming);
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 4294967296);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 88743ab33b..6846550e87 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -220,6 +220,26 @@ ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid, synchronous_commit, binary, streaming);
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 4294967296);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
DROP SUBSCRIPTION regress_testsub;
diff --git a/src/test/subscription/t/023_skip_xact.pl b/src/test/subscription/t/023_skip_xact.pl
new file mode 100644
index 0000000000..7b29828cce
--- /dev/null
+++ b/src/test/subscription/t/023_skip_xact.pl
@@ -0,0 +1,185 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Test skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 8;
+
+sub test_subscription_error
+{
+ my ($node, $expected, $source, $relname, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# Don't flood the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ 'wal_retrieve_retry_interval = 5s');
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber, we
+# create the same tables but with primary keys, and insert some data that will
+# conflict with the data replicated from the publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Start logical replication. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to a unique constraint violation.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on);");
+
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Also wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data into test_tab1, raising an error on the subscriber due to
+# a unique constraint violation on test_tab1.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab1 VALUES (1)");
+
+# Insert enough rows into test_tab_streaming to exceed the 64kB limit, also
+# raising an error on the subscriber for the same reason.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);");
+
+# Check that the errors on both subscriptions are reported.
+test_subscription_error($node_subscriber, qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'apply', 'test_tab1', 'error reporting by the apply worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'tablesync', 'test_tab2', 'error reporting by the table sync worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'apply', 'test_tab_streaming', 'error reporting by the apply worker');
+
+# Set the XIDs of the transactions in question on the subscriptions so that
+# they will be skipped.
+my $skip_xid1 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab1'::regclass");
+my $skip_xid2 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab_streaming'::regclass");
+
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (skip_xid = $skip_xid1)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_streaming SET (skip_xid = $skip_xid2)");
+
+# Restart the subscriber so that logical replication restarts immediately,
+# without waiting for wal_retrieve_retry_interval.
+$node_subscriber->restart;
+
+# Wait until the transactions in question are skipped.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription
+WHERE subname in ('tap_sub', 'tap_sub_streaming') AND subskipxid IS NULL
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+# Insert data to test_tab1 that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+
+# Also, insert data to test_tab_streaming that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transaction.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped transaction");
+
+# Check that the view shows no entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
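For reviewers trying this out, the intended end-to-end flow across the patch
set would look roughly like the following (the subscription name and XID are
illustrative, borrowed from the examples above):

```sql
-- 1. Identify the failing transaction from the statistics view (patch 0002).
SELECT subname, relid::regclass, xid, last_failure_message
FROM pg_stat_subscription_errors;

-- 2. Tell the apply worker to skip that remote transaction (patch 0003).
ALTER SUBSCRIPTION tap_sub SET (skip_xid = 590);

-- 3. The setting is cleared automatically once the transaction has been
--    skipped (subskipxid becomes NULL); it can also be cleared manually:
ALTER SUBSCRIPTION tap_sub RESET (skip_xid);
```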
Attachment: v3-0001-Add-errcontext-to-errors-of-the-applying-logical-.patch
From 98c5447e985221bd0e49583db7e73ea2154eff24 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v3 1/3] Add errcontext to errors of the applying logical
replication changes.
This commit adds an error context to errors that happen while applying
logical replication changes, showing the command, the relation name,
the transaction ID, and the commit timestamp in the server log.
---
src/backend/commands/tablecmds.c | 7 +
src/backend/replication/logical/proto.c | 49 +++++
src/backend/replication/logical/worker.c | 220 ++++++++++++++++++++---
src/include/replication/logicalproto.h | 1 +
src/include/replication/logicalworker.h | 2 +
5 files changed, 257 insertions(+), 22 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index a16e749506..a500abaf2f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -78,6 +78,7 @@
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "pgstat.h"
+#include "replication/logicalworker.h"
#include "rewrite/rewriteDefine.h"
#include "rewrite/rewriteHandler.h"
#include "rewrite/rewriteManip.h"
@@ -1897,6 +1898,9 @@ ExecuteTruncateGuts(List *explicit_rels,
continue;
}
+ /* Set logical replication error callback info if necessary */
+ set_logicalrep_error_context_rel(rel);
+
/*
* Build the lists of foreign tables belonging to each foreign server
* and pass each list to the foreign data wrapper's callback function,
@@ -2004,6 +2008,9 @@ ExecuteTruncateGuts(List *explicit_rels,
pgstat_count_truncate(rel);
}
+ /* Reset logical replication error callback info */
+ reset_logicalrep_error_context_rel();
+
/* Now go through the hash table, and truncate foreign tables */
if (ft_htab)
{
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index a245252529..54fff7df21 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1109,3 +1109,52 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Get the string representation of a LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 186be1a188..d346377b20 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -221,6 +221,27 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname; /* used for error context */
+ char *relname; /* used for error context */
+
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .relname = NULL,
+ .nspname = NULL,
+ .remote_xid = InvalidTransactionId,
+ .committs = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -332,6 +353,10 @@ static void apply_handle_tuple_routing(ApplyExecutionData *edata,
/* Compute GID for two_phase transactions */
static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int szgid);
+static void apply_error_callback(void *arg);
+static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
+static void reset_apply_error_context_rel(void);
+static void reset_apply_error_context_info(void);
/*
* Should this worker apply changes for given relation.
@@ -825,6 +850,8 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.committime;
remote_final_lsn = begin_data.final_lsn;
@@ -858,6 +885,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -875,6 +903,8 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.prepare_time;
remote_final_lsn = begin_data.prepare_lsn;
@@ -949,6 +979,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -961,6 +992,8 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ apply_error_callback_arg.remote_xid = prepare_data.xid;
+ apply_error_callback_arg.committs = prepare_data.commit_time;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -988,6 +1021,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1000,6 +1034,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ apply_error_callback_arg.remote_xid = rollback_data.xid;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1037,6 +1072,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1093,6 +1129,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ apply_error_callback_arg.remote_xid = stream_xid;
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1149,6 +1187,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1172,7 +1211,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ apply_error_callback_arg.remote_xid = xid;
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1197,6 +1239,7 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ apply_error_callback_arg.remote_xid = subxid;
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1221,6 +1264,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1252,6 +1296,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1276,6 +1322,8 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.committs = commit_data.committime;
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1398,6 +1446,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1517,6 +1567,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1540,6 +1593,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1638,6 +1694,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1691,6 +1750,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1794,6 +1856,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1819,6 +1884,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2223,6 +2291,9 @@ apply_handle_truncate(StringInfo s)
* Even if we used CASCADE on the upstream primary we explicitly default
* to replaying changes without further cascading. This might be later
* changeable with a user specified option.
+ *
+ * Both namespace and relation name for error callback will be set in
+ * ExecuteTruncateGuts().
*/
ExecuteTruncateGuts(rels,
relids,
@@ -2253,44 +2324,54 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
+
+ /*
+ * Push apply error context callback. Other fields will be filled during
+ * applying the change.
+ */
+ apply_error_callback_arg.command = action;
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2299,45 +2380,48 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_END:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
}
/*
@@ -3570,3 +3654,95 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("during apply of \"%s\""),
+ logicalrep_message_type(apply_error_callback_arg.command));
+
+ if (apply_error_callback_arg.relname)
+ appendStringInfo(&buf, _(" for relation \"%s.%s\""),
+ apply_error_callback_arg.nspname,
+ apply_error_callback_arg.relname);
+
+ if (TransactionIdIsNormal(apply_error_callback_arg.remote_xid))
+ appendStringInfo(&buf, _(" in transaction with xid %u committs %s"),
+ apply_error_callback_arg.remote_xid,
+ apply_error_callback_arg.committs == 0
+ ? "(unset)"
+ : timestamptz_to_str(apply_error_callback_arg.committs));
+
+ errcontext("%s", buf.data);
+}
+
+/* Set relation information of apply error callback */
+static void
+set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
+{
+ apply_error_callback_arg.nspname = rel->remoterel.nspname;
+ apply_error_callback_arg.relname = rel->remoterel.relname;
+}
+
+/* Reset relation information of apply error callback */
+static void
+reset_apply_error_context_rel(void)
+{
+ apply_error_callback_arg.nspname = NULL;
+ apply_error_callback_arg.relname = NULL;
+}
+
+/* Reset all information of apply error callback */
+static void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.remote_xid = InvalidTransactionId;
+ apply_error_callback_arg.committs = 0;
+ reset_apply_error_context_rel();
+}
+
+/*
+ * Set relation information of error callback.
+ *
+ * Both set_logicalrep_error_context_rel() and
+ * reset_logicalrep_error_context_rel() are intended to be used by
+ * functions outside the logical replication module that don't use
+ * LogicalRepRelMapEntry.
+ *
+ * The caller must call reset_logicalrep_error_context_rel() after use
+ * so that the memory used for the names is freed.
+ */
+void
+set_logicalrep_error_context_rel(Relation rel)
+{
+ if (IsLogicalWorker())
+ {
+ apply_error_callback_arg.nspname =
+ get_namespace_name(RelationGetNamespace(rel));
+ apply_error_callback_arg.relname =
+ pstrdup(RelationGetRelationName(rel));
+ }
+}
+
+/* Reset relation information for the error callback */
+void
+reset_logicalrep_error_context_rel(void)
+{
+ if (IsLogicalWorker())
+ {
+ if (apply_error_callback_arg.nspname)
+ pfree(apply_error_callback_arg.nspname);
+ apply_error_callback_arg.nspname = NULL;
+
+ if (apply_error_callback_arg.relname)
+ pfree(apply_error_callback_arg.relname);
+ apply_error_callback_arg.relname = NULL;
+ }
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 63de90d94a..c78a4409bc 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -242,5 +242,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/include/replication/logicalworker.h b/src/include/replication/logicalworker.h
index 2ad61a001a..d3e8514ffd 100644
--- a/src/include/replication/logicalworker.h
+++ b/src/include/replication/logicalworker.h
@@ -15,5 +15,7 @@
extern void ApplyWorkerMain(Datum main_arg);
extern bool IsLogicalWorker(void);
+extern void set_logicalrep_error_context_rel(Relation rel);
+extern void reset_logicalrep_error_context_rel(void);
#endif /* LOGICALWORKER_H */
--
2.24.3 (Apple Git-128)
Attachment: v3-0002-Add-pg_stat_logical_replication_error-statistics-.patch
From 6a0003f1d248214cdcb6dbb712cf3a82c63af986 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v3 2/3] Add pg_stat_logical_replication_error statistics view.
This commit adds a new system view, pg_stat_subscription_errors, showing
errors that happen while applying logical replication changes as well as
during initial table synchronization.
It also adds the SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
---
doc/src/sgml/monitoring.sgml | 151 +++++
src/backend/catalog/pg_subscription.c | 23 +-
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/commands/subscriptioncmds.c | 30 +-
src/backend/postmaster/pgstat.c | 634 ++++++++++++++++++++
src/backend/replication/logical/tablesync.c | 16 +-
src/backend/replication/logical/worker.c | 48 +-
src/backend/utils/adt/pgstatfuncs.c | 120 ++++
src/backend/utils/error/elog.c | 16 +
src/include/catalog/pg_proc.dat | 13 +
src/include/catalog/pg_subscription_rel.h | 3 +-
src/include/pgstat.h | 112 ++++
src/include/utils/elog.h | 1 +
src/test/regress/expected/rules.out | 22 +
15 files changed, 1187 insertions(+), 31 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..ca9eec5e22 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
<entry>One row per error that occurred on a subscription, showing information
about the subscription errors.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,126 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error occurred. This
+ field is always NULL if the error is reported by a
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher's transaction that was being
+ applied when the error occurred. This field is always NULL if
+ the error is reported by a <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal>
+ or <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times the error has occurred in this worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the last failure.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
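As a quick illustration of how the new view could be consulted on the subscriber (the subscription name, OIDs, and returned values here are hypothetical, and the exact column set is as proposed in this patch), a sketch:

```sql
-- Hypothetical query on a subscriber with this patch applied: find the
-- remote XID of the transaction the apply worker keeps failing on, so it
-- can later be skipped.
SELECT subname, relid, command, xid, failure_count, last_failure_message
  FROM pg_stat_subscription_errors
 WHERE failure_source = 'apply';
```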
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5301,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the error statistics of a single subscription. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
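For example, under the function signature proposed above (the subscription and relation OIDs below are purely illustrative), the reset function would be invoked as:

```sql
-- Reset the apply worker's error statistics for subscription 16390:
SELECT pg_stat_reset_subscription_error(16390, NULL);

-- Reset the tablesync worker's error statistics for relation 16402
-- of the same subscription:
SELECT pg_stat_reset_subscription_error(16390, 16402);
```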
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..d76bdff36a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -541,18 +541,27 @@ GetSubscriptionRelations(Oid subid)
/*
* Get all relations for subscription that are not in a ready state.
*
- * Returned list is palloc'ed in current memory context.
+ * Returned HTAB is created in current memory context.
*/
-List *
+HTAB *
GetSubscriptionNotReadyRelations(Oid subid)
{
- List *res = NIL;
+ HTAB *htab;
+ HASHCTL hash_ctl;
Relation rel;
HeapTuple tup;
int nkeys = 0;
ScanKeyData skey[2];
SysScanDesc scan;
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ hash_ctl.hcxt = CurrentMemoryContext;
+ htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
rel = table_open(SubscriptionRelRelationId, AccessShareLock);
ScanKeyInit(&skey[nkeys++],
@@ -577,8 +586,8 @@ GetSubscriptionNotReadyRelations(Oid subid)
subrel = (Form_pg_subscription_rel) GETSTRUCT(tup);
- relstate = (SubscriptionRelState *) palloc(sizeof(SubscriptionRelState));
- relstate->relid = subrel->srrelid;
+ relstate = (SubscriptionRelState *) hash_search(htab, (void *) &subrel->srrelid,
+ HASH_ENTER, NULL);
relstate->state = subrel->srsubstate;
d = SysCacheGetAttr(SUBSCRIPTIONRELMAP, tup,
Anum_pg_subscription_rel_srsublsn, &isnull);
@@ -586,13 +595,11 @@ GetSubscriptionNotReadyRelations(Oid subid)
relstate->lsn = InvalidXLogRecPtr;
else
relstate->lsn = DatumGetLSN(d);
-
- res = lappend(res, relstate);
}
/* Cleanup */
systable_endscan(scan);
table_close(rel, AccessShareLock);
- return res;
+ return htab;
}
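To make the motivation for the List-to-HTAB change concrete — callers such as pgstat_vacuum_stat() want O(1) membership probes by relation OID rather than a linear list walk — here is a minimal, self-contained sketch of the lookup pattern. RelStateEntry and relstate_search are illustrative stand-ins for dynahash's HASH_ENTER/HASH_FIND semantics, not PostgreSQL's actual API:

```c
#include <assert.h>
#include <string.h>

typedef unsigned int Oid;

#define TBL_SIZE 64             /* power of two, open addressing */

typedef struct RelStateEntry
{
    Oid   relid;                /* key; 0 means empty slot */
    char  state;                /* sync state, e.g. 'i', 'd', 's' */
} RelStateEntry;

typedef struct RelStateTable
{
    RelStateEntry slots[TBL_SIZE];
} RelStateTable;

/* Find the slot for relid; claim an empty slot if enter is set. */
static RelStateEntry *
relstate_search(RelStateTable *tbl, Oid relid, int enter)
{
    unsigned int i = (relid * 2654435761u) & (TBL_SIZE - 1);

    for (int probes = 0; probes < TBL_SIZE; probes++)
    {
        RelStateEntry *e = &tbl->slots[i];

        if (e->relid == relid)
            return e;           /* found existing entry (HASH_FIND hit) */
        if (e->relid == 0)
        {
            if (!enter)
                return NULL;    /* HASH_FIND miss */
            e->relid = relid;   /* HASH_ENTER: claim the empty slot */
            return e;
        }
        i = (i + 1) & (TBL_SIZE - 1);   /* linear probing */
    }
    return NULL;                /* table full */
}
```

The point of the pattern is that membership tests such as the "is this relation still not-ready?" check in the purge logic become a single probe instead of a list scan.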
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..cd07f2e02f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 22ae982328..da02d3bbfa 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -86,7 +86,7 @@ typedef struct SubOpts
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
-static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
+static void ReportSlotConnectionError(HTAB *rstates, Oid subid, char *slotname, char *err);
/*
@@ -1163,7 +1163,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
char *err = NULL;
WalReceiverConn *wrconn;
Form_pg_subscription form;
- List *rstates;
+ HTAB *rstates;
+ HASH_SEQ_STATUS hstat;
+ SubscriptionRelState *rstate;
/*
* Lock pg_subscription with AccessExclusiveLock to ensure that the
@@ -1286,9 +1288,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* exclusive lock on the subscription.
*/
rstates = GetSubscriptionNotReadyRelations(subid);
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
Oid relid = rstate->relid;
/* Only cleanup resources of tablesync workers */
@@ -1321,8 +1323,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* If there is no slot associated with the subscription, we can finish
* here.
*/
- if (!slotname && rstates == NIL)
+ if (!slotname && hash_get_num_entries(rstates) == 0)
{
+ hash_destroy(rstates);
table_close(rel, NoLock);
return;
}
@@ -1346,7 +1349,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
if (!slotname)
{
/* be tidy */
- list_free(rstates);
+ hash_destroy(rstates);
table_close(rel, NoLock);
return;
}
@@ -1358,9 +1361,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
PG_TRY();
{
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
Oid relid = rstate->relid;
/* Only cleanup resources of tablesync workers */
@@ -1389,7 +1392,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
}
- list_free(rstates);
+ hash_destroy(rstates);
/*
* If there is a slot associated with the subscription, then drop the
@@ -1641,13 +1644,14 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
* them manually, if required.
*/
static void
-ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err)
+ReportSlotConnectionError(HTAB *rstates, Oid subid, char *slotname, char *err)
{
- ListCell *lc;
+ HASH_SEQ_STATUS hstat;
+ SubscriptionRelState *rstate;
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
Oid relid = rstate->relid;
/* Only cleanup resources of tablesync workers */
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 11702f2a80..a8d2a0bc65 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/partition.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE 32
/* ----------
@@ -279,6 +282,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionErrHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -320,6 +324,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ bool create);
+static PgStat_StatSubRelErrEntry *pgstat_get_subscription_rel_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -358,6 +368,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1134,6 +1148,133 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionErrHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubErrEntry *suberrent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionErrHash);
+ while ((suberrent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASH_SEQ_STATUS hstat_rel;
+ HTAB *rstates;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(suberrent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = suberrent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (suberrent->suberrors == NULL)
+ continue;
+
+ /*
+ * The subscription has error entries. We search errors of the
+ * table sync workers who are already in sync state. Those errors
+ * should be removed.
+ *
+ * Note that the lifetimes of the error entries of the apply worker
+ * and the table sync worker are different. The former lives until
+ * the subscription is dropped, whereas the latter lives until the
+ * table synchronization is completed.
+ */
+ rstates = GetSubscriptionNotReadyRelations(suberrent->subid);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = suberrent->subid;
+
+ hash_seq_init(&hstat_rel, suberrent->suberrors);
+ while ((relerrent = (PgStat_StatSubRelErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(relerrent->subrelid))
+ continue;
+
+ /*
+ * Add the relid to the message if the table synchronization
+ * for this relation is already complete or the table is no
+ * longer subscribed.
+ */
+ if (hash_search(rstates, (void *) &(relerrent->subrelid), HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = relerrent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(rstates);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
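The purge logic above batches dead OIDs into fixed-size messages, flushing whenever a message fills and once more for the remainder. A minimal, self-contained sketch of that accumulate-and-flush pattern (BATCH_MAX, flush_batch, and flush_count are illustrative stand-ins, not PostgreSQL symbols):

```c
#include <assert.h>

#define BATCH_MAX 4             /* stands in for PGSTAT_NUM_SUBSCRIPTIONPURGE */

typedef unsigned int Oid;

static int flush_count;         /* how many batches were "sent" */

static void
flush_batch(const Oid *ids, int n)
{
    (void) ids;
    if (n > 0)
        flush_count++;          /* stands in for pgstat_send() */
}

/* Report ndead dead OIDs, batching them BATCH_MAX at a time. */
static void
purge_dead(const Oid *dead, int ndead)
{
    Oid batch[BATCH_MAX];
    int n = 0;

    for (int i = 0; i < ndead; i++)
    {
        batch[n++] = dead[i];
        if (n >= BATCH_MAX)     /* message full: send and reinitialize */
        {
            flush_batch(batch, n);
            n = 0;
        }
    }
    flush_batch(batch, n);      /* send any remainder */
}
```

Nine dead OIDs with a batch size of four thus produce three sends: two full messages and one remainder.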
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1543,6 +1684,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the error statistics of a subscription.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
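The reset function above sends only the message prefix up to and including m_reset, computed with offsetof(), since the trailing error-message string is meaningless for a reset. A self-contained sketch of that length computation (DemoMsg is an illustrative layout, not the actual PgStat_MsgSubscriptionErr struct, and the int stand-in for the bool flag is an assumption of the sketch):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct DemoMsg
{
    int   m_subid;
    int   m_subrelid;
    int   m_reset;              /* stand-in for the bool flag */
    char  m_errmsg[256];        /* omitted entirely for reset messages */
} DemoMsg;

/* Bytes to transmit for a reset request: everything up to m_reset. */
static size_t
reset_msg_len(void)
{
    return offsetof(DemoMsg, m_reset) + sizeof(int);
}

/*
 * Bytes to transmit for an error report: the fixed header plus only the
 * used part of the message string, including its NUL terminator.
 */
static size_t
error_msg_len(const char *errmsg)
{
    return offsetof(DemoMsg, m_errmsg) + strlen(errmsg) + 1;
}
```

Truncating the message this way keeps short error reports from always paying for the full fixed-size string buffer on the stats socket.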
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1863,6 +2023,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about an error on a subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_reset = false;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_last_failure = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2895,6 +3086,38 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription errors struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, false);
+}
+
+/*
+ * ---------
+ * pgstat_fetch_subscription_rel_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubRelErrEntry *
+pgstat_fetch_subscription_rel_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_rel_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3424,6 +3647,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3725,6 +3961,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionErrHash)
+ {
+ PgStat_StatSubErrEntry *suberrent;
+
+ hash_seq_init(&hstat, subscriptionErrHash);
+ while ((suberrent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASH_SEQ_STATUS relhstat;
+ long nrels;
+
+ /* Skip this subscription if it does not have any errors */
+ if (suberrent->suberrors == NULL)
+ continue;
+
+ nrels = hash_get_num_entries(suberrent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(suberrent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nrels, sizeof(long), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, suberrent->suberrors);
+ while ((relerrent = (PgStat_StatSubRelErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubRelErrEntry entry, which
+ * contains the fixed-length error message string of
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes, bloating the stats file.
+ * We don't expect many error entries, but if that expectation
+ * turns out to be false we should instead write the string
+ * prefixed by its length.
+ */
+ rc = fwrite(relerrent, sizeof(PgStat_StatSubRelErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4184,6 +4464,99 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubErrEntry struct followed by the number
+ * of errors and PgStat_StatSubRelErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubErrEntry suberrbuf;
+ PgStat_StatSubErrEntry *suberrent;
+ long nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&suberrbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionErrHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionErrHash = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription error entry */
+ suberrent =
+ (PgStat_StatSubErrEntry *) hash_search(subscriptionErrHash,
+ (void *) &(suberrbuf.subid),
+ HASH_ENTER, NULL);
+ suberrent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(long), fpin) != sizeof(long))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubRelErrEntry *subrelent;
+ PgStat_StatSubRelErrEntry subrelbuf;
+
+ if (fread(&subrelbuf, 1, sizeof(PgStat_StatSubRelErrEntry), fpin) !=
+ sizeof(PgStat_StatSubRelErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (suberrent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubRelErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ suberrent->suberrors = hash_create("Subscription relation error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ subrelent =
+ (PgStat_StatSubRelErrEntry *) hash_search(suberrent->suberrors,
+ (void *) &(subrelbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(subrelent, &subrelbuf, sizeof(PgStat_StatSubRelErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4526,6 +4899,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubErrEntry struct followed by the number
+ * of errors and PgStat_StatSubRelErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubErrEntry mySubErrs;
+ PgStat_StatSubRelErrEntry subrelbuf;
+ long nrels;
+
+ if (fread(&mySubErrs, 1, sizeof(PgStat_StatSubErrEntry), fpin)
+ != sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nrels, 1, sizeof(long), fpin) != sizeof(long))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nrels; i++)
+ {
+ if (fread(&subrelbuf, 1, sizeof(PgStat_StatSubRelErrEntry), fpin) !=
+ sizeof(PgStat_StatSubRelErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4716,6 +5133,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionErrHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5650,6 +6068,117 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubRelErrEntry *relerrent;
+
+ /* Get subscription error */
+ relerrent = pgstat_get_subscription_rel_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ true);
+ Assert(relerrent);
+
+ if (msg->m_reset)
+ {
+ /* reset fields and set reset timestamp */
+ relerrent->relid = InvalidOid;
+ relerrent->command = 0;
+ relerrent->xid = InvalidTransactionId;
+ relerrent->failure_count = 0;
+ relerrent->last_failure = 0;
+ relerrent->errmsg[0] = '\0';
+ relerrent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ /* update the error entry */
+ relerrent->databaseid = msg->m_databaseid;
+ relerrent->relid = msg->m_relid;
+ relerrent->command = msg->m_command;
+ relerrent->xid = msg->m_xid;
+ relerrent->failure_count++;
+ relerrent->last_failure = msg->m_last_failure;
+ strlcpy(relerrent->errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionErrHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubErrEntry *suberrent;
+
+ suberrent = hash_search(subscriptionErrHash, (void *) &(msg->m_subids[i]),
+ HASH_FIND, NULL);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (suberrent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (suberrent->suberrors != NULL)
+ hash_destroy(suberrent->suberrors);
+
+ (void) hash_search(subscriptionErrHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ if (subscriptionErrHash == NULL)
+ return;
+
+ PgStat_StatSubErrEntry *suberrent;
+
+ suberrent = hash_search(subscriptionErrHash, (void *) &(msg->m_subid),
+ HASH_FIND, NULL);
+
+ /*
+ * Nothing to do if the subscription entry is not found or has no error
+ * entries. This could happen when the subscription with msg->m_subid is
+ * removed and the corresponding entry is also removed before receiving
+ * the error purge message.
+ */
+ if (suberrent == NULL || suberrent->suberrors == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ (void) hash_search(suberrent->suberrors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5747,6 +6276,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription OID.
+ * Return NULL if not found and the caller didn't request to create it.
+ *
+ * create tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, bool create)
+{
+ PgStat_StatSubErrEntry *suberrent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionErrHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subscriptionErrHash = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ suberrent = (PgStat_StatSubErrEntry *) hash_search(subscriptionErrHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ suberrent->suberrors = NULL;
+
+ return suberrent;
+}
+
+/* ----------
+ * pgstat_get_subscription_rel_error_entry
+ *
+ * Return the subscription relation error entry for the given subscription
+ * OID and relation OID. Return NULL if not found and the caller didn't
+ * request to create it.
+ *
+ * create tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubRelErrEntry *
+pgstat_get_subscription_rel_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubErrEntry *suberrent;
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ suberrent = pgstat_get_subscription_error_entry(subid, create);
+
+ if (suberrent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (suberrent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubRelErrEntry);
+ suberrent->suberrors = hash_create("Subscription relation error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ relerrent = (PgStat_StatSubRelErrEntry *) hash_search(suberrent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ relerrent->databaseid = InvalidOid;
+ relerrent->relid = InvalidOid;
+ relerrent->command = 0;
+ relerrent->xid = InvalidTransactionId;
+ relerrent->failure_count = 0;
+ relerrent->last_failure = 0;
+ relerrent->errmsg[0] = '\0';
+ relerrent->stat_reset_timestamp = 0;
+ }
+
+ return relerrent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a43c..8765396432 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1175,8 +1175,8 @@ FetchTableStates(bool *started_tx)
if (!table_states_valid)
{
MemoryContext oldctx;
- List *rstates;
- ListCell *lc;
+ HTAB *rstates;
+ HASH_SEQ_STATUS hstat;
SubscriptionRelState *rstate;
/* Clean the old lists. */
@@ -1194,14 +1194,18 @@ FetchTableStates(bool *started_tx)
/* Allocate the tracking info in a permanent memory context. */
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- rstate = palloc(sizeof(SubscriptionRelState));
- memcpy(rstate, lfirst(lc), sizeof(SubscriptionRelState));
- table_states_not_ready = lappend(table_states_not_ready, rstate);
+ SubscriptionRelState *r = palloc(sizeof(SubscriptionRelState));
+
+ memcpy(r, rstate, sizeof(SubscriptionRelState));
+ table_states_not_ready = lappend(table_states_not_ready, r);
}
MemoryContextSwitchTo(oldctx);
+ hash_destroy(rstates);
+
/*
* Does the subscription have tables?
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index d346377b20..4f9c4e9014 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -227,6 +227,7 @@ typedef struct ApplyErrCallbackArg
LogicalRepMsgType command; /* 0 if invalid */
/* Local relation information */
+ Oid relid; /* used for error reporting */
char *nspname; /* used for error context */
char *relname; /* used for error context */
@@ -236,6 +237,7 @@ typedef struct ApplyErrCallbackArg
static ApplyErrCallbackArg apply_error_callback_arg =
{
.command = 0,
+ .relid = InvalidOid,
.relname = NULL,
.nspname = NULL,
.remote_xid = InvalidTransactionId,
@@ -3522,8 +3524,26 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ geterrmessage());
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3641,7 +3661,24 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.relid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ geterrmessage());
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
@@ -3686,6 +3723,7 @@ apply_error_callback(void *arg)
static void
set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
{
+ apply_error_callback_arg.relid = rel->localreloid;
apply_error_callback_arg.nspname = rel->remoterel.nspname;
apply_error_callback_arg.relname = rel->remoterel.relname;
}
@@ -3694,6 +3732,7 @@ set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
static void
reset_apply_error_context_rel(void)
{
+ apply_error_callback_arg.relid = InvalidOid;
apply_error_callback_arg.nspname = NULL;
apply_error_callback_arg.relname = NULL;
}
@@ -3724,6 +3763,7 @@ set_logicalrep_error_context_rel(Relation rel)
{
if (IsLogicalWorker())
{
+ apply_error_callback_arg.relid = RelationGetRelid(rel);
apply_error_callback_arg.nspname =
get_namespace_name(RelationGetNamespace(rel));
apply_error_callback_arg.relname =
@@ -3737,6 +3777,8 @@ reset_logicalrep_error_context_rel(void)
{
if (IsLogicalWorker())
{
+ apply_error_callback_arg.relid = InvalidOid;
+
if (apply_error_callback_arg.nspname)
pfree(apply_error_callback_arg.nspname);
apply_error_callback_arg.nspname = NULL;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f0e09eae4d..f1348a415e 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,8 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
+#include "replication/logicalworker.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2240,6 +2242,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset subscription error stats */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2380,3 +2399,104 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubRelErrEntry *relerrent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ relerrent = pgstat_fetch_subscription_rel_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (relerrent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(relerrent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(relerrent->relid))
+ values[2] = ObjectIdGetDatum(relerrent->relid);
+ else
+ nulls[2] = true;
+
+ if (relerrent->command == 0)
+ nulls[3] = true;
+ else
+ {
+ /* should be apply worker */
+ Assert(!OidIsValid(relerrent->subrelid));
+
+ values[3] = CStringGetTextDatum(logicalrep_message_type(relerrent->command));
+ }
+
+ if (TransactionIdIsValid(relerrent->xid))
+ values[4] = TransactionIdGetDatum(relerrent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(relerrent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+
+ values[6] = Int64GetDatum(relerrent->failure_count);
+
+ if (relerrent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(relerrent->last_failure);
+
+ values[8] = CStringGetTextDatum(relerrent->errmsg);
+
+ if (relerrent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(relerrent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..dd36850016 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,6 +1441,22 @@ getinternalerrposition(void)
return edata->internalpos;
}
+/*
+ * geterrmessage --- return the currently set error message
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+const char *
+geterrmessage(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return (const char *) edata->message;
+}
/*
* Functions to allow construction of error message strings separately from
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 8cd0252082..044ff52227 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5321,6 +5321,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5708,6 +5716,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index 632381b4e3..50053cdafc 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -22,6 +22,7 @@
#include "catalog/genbki.h"
#include "catalog/pg_subscription_rel_d.h"
#include "nodes/pg_list.h"
+#include "utils/hsearch.h"
/* ----------------
* pg_subscription_rel definition. cpp turns this into
@@ -89,6 +90,6 @@ extern void RemoveSubscriptionRel(Oid subid, Oid relid);
extern bool HasSubscriptionRelations(Oid subid);
extern List *GetSubscriptionRelations(Oid subid);
-extern List *GetSubscriptionNotReadyRelations(Oid subid);
+extern HTAB *GetSubscriptionNotReadyRelations(Oid subid);
#endif /* PG_SUBSCRIPTION_REL_H */
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9612c0a6c2..1104886bef 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -539,6 +543,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
+ * report, reset, or clear an error that happened during
+ * logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* The clear messages use below field */
+ bool m_reset; /* clear all fields and set reset_stats
+ * timestamp */
+
+ /* The error report message uses below fields */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_last_failure;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by autovacuum to purge the subscription
+ * errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by autovacuum to purge the subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -710,6 +776,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -908,6 +977,42 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription error statistics kept in the stats collector.
+ *
+ * PgStat_StatSubErrEntry holds all errors associated with the subscription,
+ * reported by the apply worker and the table sync workers. This entry is
+ * created when the first error message for the subscription is reported
+ * and is dropped along with its errors when the subscription is dropped.
+ *
+ * PgStat_StatSubRelErrEntry represents an error that happened during logical
+ * replication, reported by the apply worker (subrelid is InvalidOid) or by the
+ * table sync worker (subrelid is a valid OID). The error reported by the apply
+ * worker is dropped when the subscription is dropped, whereas the error reported
+ * by the table sync worker is dropped when the table synchronization process
+ * completes.
+ */
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubErrEntry;
+
+typedef struct PgStat_StatSubRelErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubRelErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -995,6 +1100,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1011,6 +1117,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1106,6 +1215,9 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid);
+extern PgStat_StatSubRelErrEntry *pgstat_fetch_subscription_rel_error(Oid subid,
+ Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f53607e12e..155145a77d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -222,6 +222,7 @@ extern int err_generic_string(int field, const char *str);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
+extern const char *geterrmessage(void);
/*----------
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..ffad9790ae 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
--
2.24.3 (Apple Git-128)
On Mon, Jul 26, 2021 at 11:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Thu, Jul 22, 2021 at 8:53 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> > On July 20, 2021 9:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > On Mon, Jul 19, 2021 at 8:38 PM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > > On July 19, 2021 2:40 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > I've attached the updated version patch that incorporated all
> > > > > comments I got so far except for the clearing error details part I
> > > > > mentioned above. After getting a consensus on those parts, I'll
> > > > > incorporate the idea into the patches.
> > > >
> > > > 3) For the 0003 patch, if the user sets skip_xid to a wrong xid that
> > > > has not been assigned yet, will the change be skipped when the xid is
> > > > assigned in the future even if it doesn't cause any conflicts?
> > >
> > > Yes. Currently, setting a correct xid is the user's responsibility. I
> > > think it would be better to disable it or emit a WARNING/ERROR when the
> > > user mistakenly sets the wrong xid, if we find a convenient way to
> > > detect that.
> >
> > Thanks for the explanation. As Amit suggested, it seems we can document
> > the risk of misusing skip_xid. Besides, I found some minor things in the
> > patch.
> >
> > 1) In the 0002 patch
> >
> > + */
> > +static void
> > +pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
> > +{
> > +	if (subscriptionErrHash != NULL)
> > +		return;
> > +
> > +static void
> > +pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
> > +{
> >
> > The second parameter "len" seems not to be used in the functions
> > pgstat_recv_subscription_purge() and pgstat_recv_subscription_error().
>
> 'len' is not used at all, not only in the functions the patch adds but
> also in the other pgstat_recv_* functions. Can we remove all of them in
> a separate patch? 'len' in the pgstat_recv_* functions has never been
> used since the stats collector code was introduced. It seems that it was
> mistakenly introduced in the first commit, and the other pgstat_recv_*
> functions that followed it also defined 'len' without using it.
>
> > 2) In the 0003 patch
> >
> >  * Helper function for apply_handle_commit and apply_handle_stream_commit.
> >  */
> > static void
> > -apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
> > +apply_handle_commit_internal(LogicalRepCommitData *commit_data)
> > {
> >
> > This looks like a separate change that removes an unused parameter in
> > existing code; maybe we can get this committed first?
>
> Yeah, it seems to have been introduced by commit 0926e96c493. I've
> attached the patch for that.
>
> Also, I've attached the updated version patches. This version has a
> pg_stat_reset_subscription_error() SQL function and sends a clear
> message after skipping the transaction. The 0004 patch includes both the
> transaction-skipping feature and the new RESET clause for ALTER
> SUBSCRIPTION; it would be better to separate them.
I've attached the new version patches that fix cfbot failure.
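
For reference, the intended workflow with these patches would look roughly like
the following. This is only an illustrative sketch: the subscription name, OID,
and xid are taken from the earlier examples in this thread, and the view,
function, and option names are as proposed in the attached patches.

```sql
-- Identify the failing transaction from the proposed statistics view.
SELECT subname, relid, command, xid, failure_count, last_failure_message
FROM pg_stat_subscription_errors;

-- As a last resort, skip the whole remote transaction (xid 716 here).
-- skip_xid is cleared automatically once the transaction is skipped.
ALTER SUBSCRIPTION test_sub SET (skip_xid = 716);

-- The new RESET clause can clear skip_xid (and other parameters) manually.
ALTER SUBSCRIPTION test_sub RESET (skip_xid);

-- Reset the recorded apply-worker error stats for the subscription
-- (16395 is the subscription OID from the example output above).
SELECT pg_stat_reset_subscription_error(16395, NULL);
```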
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v4-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patchapplication/x-patch; name=v4-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patchDownload
From 202e813150db4b13ed3b4002a82b235622b7968b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:18:58 +0900
Subject: [PATCH v4 3/3] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question.
The user can specify the XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating the pg_subscription.subskipxid field and telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction. After skipping
the transaction, the apply worker clears subskipxid. It also clears the
error statistics of the subscription in the pg_stat_subscription_errors
system view.
To reset the skip_xid parameter (and other parameters), this commit also
adds a RESET clause to the ALTER SUBSCRIPTION command.
---
doc/src/sgml/logical-replication.sgml | 49 +++-
doc/src/sgml/ref/alter_subscription.sgml | 46 +++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 146 +++++++++--
src/backend/parser/gram.y | 11 +-
src/backend/postmaster/pgstat.c | 41 +++-
src/backend/replication/logical/worker.c | 273 +++++++++++++++++----
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 4 +-
src/include/pgstat.h | 4 +-
src/test/regress/expected/subscription.out | 22 ++
src/test/regress/sql/subscription.sql | 19 ++
src/test/subscription/t/023_skip_xact.pl | 185 ++++++++++++++
13 files changed, 729 insertions(+), 85 deletions(-)
create mode 100644 src/test/subscription/t/023_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..992d8b4ac1 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <link linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in <structname>pg_stat_subscription_errors</structname>
+ view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: during apply of "INSERT" for relation "public.test" in transaction with xid 716 committs 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found in those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> for the subscription
+ with <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Either way, these should be used as a last resort. They skip the whole
+ transaction, including changes that may not violate any constraint, and can
+ easily make the subscriber inconsistent if the user specifies the wrong
+ transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..591f554fc7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,15 +193,46 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
- This clause alters parameters originally set by
- <xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
- are <literal>slot_name</literal>,
- <literal>synchronous_commit</literal>,
- <literal>binary</literal>, and
- <literal>streaming</literal>.
+ This clause sets or resets a subscription option. The parameters that can be
+ set are the parameters originally set by <xref linkend="sql-createsubscription"/>:
+ <literal>slot_name</literal>, <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>, and the following
+ parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ stops until the problem is resolved. The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ the incoming change or by skipping the whole transaction. This option
+ specifies the transaction ID whose changes the logical replication
+ worker will skip. The worker skips all data modification
+ changes within the specified transaction. Therefore, since it skips
+ the whole transaction, including changes that may not violate the
+ constraint, it should only be used as a last resort. This option has
+ no effect on a transaction that is already prepared with
+ <literal>two_phase</literal> enabled on the subscriber. After logical
+ replication successfully skips the transaction, the transaction ID
+ (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>,
+ and <literal>skip_xid</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d76bdff36a..8ecc55150e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index da02d3bbfa..0cc965c056 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -99,7 +101,8 @@ static void ReportSlotConnectionError(HTAB *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -128,12 +131,23 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset)
+ {
+ if (defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
+ }
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -141,7 +155,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_CONNECT;
- opts->connect = defGetBoolean(defel);
+ if (!is_reset)
+ opts->connect = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_ENABLED) &&
strcmp(defel->defname, "enabled") == 0)
@@ -150,7 +165,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_ENABLED;
- opts->enabled = defGetBoolean(defel);
+ if (!is_reset)
+ opts->enabled = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_CREATE_SLOT) &&
strcmp(defel->defname, "create_slot") == 0)
@@ -159,7 +175,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_CREATE_SLOT;
- opts->create_slot = defGetBoolean(defel);
+ if (!is_reset)
+ opts->create_slot = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_SLOT_NAME) &&
strcmp(defel->defname, "slot_name") == 0)
@@ -168,7 +185,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SLOT_NAME;
- opts->slot_name = defGetString(defel);
+ if (!is_reset)
+ opts->slot_name = defGetString(defel);
/* Setting slot_name = NONE is treated as no slot name. */
if (strcmp(opts->slot_name, "none") == 0)
@@ -183,7 +201,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_COPY_DATA;
- opts->copy_data = defGetBoolean(defel);
+ if (!is_reset)
+ opts->copy_data = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_SYNCHRONOUS_COMMIT) &&
strcmp(defel->defname, "synchronous_commit") == 0)
@@ -192,12 +211,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -206,7 +231,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_REFRESH;
- opts->refresh = defGetBoolean(defel);
+ if (!is_reset)
+ opts->refresh = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_BINARY) &&
strcmp(defel->defname, "binary") == 0)
@@ -215,7 +241,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +251,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -245,7 +273,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
- opts->twophase = defGetBoolean(defel);
+ if (!is_reset)
+ opts->twophase = defGetBoolean(defel);
+ }
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
}
else
ereport(ERROR,
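As a side note on the skip_xid validation above: xidin plus TransactionIdIsNormal effectively rejects anything that is not a plain decimal number or that falls below FirstNormalTransactionId (3), since XIDs 0-2 are reserved. A standalone sketch of that check (plain C, not backend code; names are illustrative):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define FIRST_NORMAL_XID 3      /* XIDs 0 (invalid), 1 (bootstrap), 2 (frozen) are reserved */

/* Parse str as a 32-bit XID; return 0 on success, -1 on error. */
static int
parse_skip_xid(const char *str, uint32_t *xid)
{
    char *end;
    unsigned long val;

    errno = 0;
    val = strtoul(str, &end, 10);
    if (errno != 0 || end == str || *end != '\0' || val > UINT32_MAX)
        return -1;              /* not a plain decimal number in range */
    if (val < FIRST_NORMAL_XID)
        return -1;              /* reserved, non-normal XID */
    *xid = (uint32_t) val;
    return 0;
}
```

This is why the regression tests below expect XIDs 0, 1, 2, and "1.1" to fail while 3 and 4294967295 succeed.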
@@ -416,7 +468,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -489,6 +542,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -885,14 +939,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -944,14 +998,60 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_STREAMING |
+ SUBOPT_BINARY | SUBOPT_SKIP_XID);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -986,7 +1086,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1036,7 +1136,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1084,7 +1184,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..bcf85e8980 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,7 +9707,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 4a35e6640f..d3f0a2ea2f 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1699,6 +1699,27 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
msg.m_subid = subid;
msg.m_subrelid = subrelid;
msg.m_reset = true;
+ msg.m_clear = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_clear) + sizeof(bool));
+}
+
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error of the subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = false;
+ msg.m_clear = true;
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_clear) + sizeof(bool));
}
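One thing worth spelling out here: pgstat_send transmits only a prefix of the message struct, sized with offsetof, so appending m_clear after m_reset means the prefix length must now extend through m_clear or the new flag is silently truncated. A standalone mock of the sizing (illustrative struct layout, not the real PgStat_MsgSubscriptionErr):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock message layout: header stand-in, ids, then the two flags. */
typedef struct MockSubErrMsg
{
    int          m_hdr;         /* stands in for PgStat_MsgHdr */
    unsigned int m_subid;
    unsigned int m_subrelid;
    bool         m_reset;
    bool         m_clear;       /* appended after m_reset by this patch */
    /* ... error-report fields follow and need not be sent for reset/clear ... */
    int          m_databaseid;
} MockSubErrMsg;

/* Length to send so every field up to and including m_clear arrives. */
static size_t
clear_msg_len(void)
{
    return offsetof(MockSubErrMsg, m_clear) + sizeof(bool);
}
```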
@@ -2046,6 +2067,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
msg.m_reset = false;
+ msg.m_clear = false;
msg.m_command = command;
msg.m_xid = xid;
msg.m_last_failure = GetCurrentTimestamp();
@@ -6078,26 +6100,37 @@ static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
PgStat_StatSubRelErrEntry *relerrent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
relerrent = pgstat_get_subscription_rel_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
+ Assert(!(msg->m_reset && msg->m_clear));
+
if (relerrent == NULL)
return;
- /* reset fields and set reset timestamp */
+ /* reset fields */
relerrent->relid = InvalidOid;
relerrent->command = 0;
relerrent->xid = InvalidTransactionId;
- relerrent->failure_count = 0;
relerrent->last_failure = 0;
relerrent->errmsg[0] = '\0';
- relerrent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ relerrent->failure_count = 0;
+ relerrent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
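To summarize the intended reset vs. clear semantics in runnable form: reset wipes everything and stamps the reset time, while clear wipes only the error details and keeps the cumulative failure count. A simplified mock (not the real PgStat_StatSubRelErrEntry):

```c
#include <assert.h>
#include <stdbool.h>

typedef struct MockErrEntry
{
    unsigned int relid;
    unsigned int xid;
    long         failure_count;
    long         last_failure;
    char         errmsg[64];
    long         stat_reset_timestamp;
} MockErrEntry;

/*
 * reset clears all fields and stamps the reset time; clear wipes the error
 * details but preserves the cumulative failure count and reset timestamp.
 */
static void
reset_or_clear(MockErrEntry *e, bool reset, long now)
{
    e->relid = 0;
    e->xid = 0;
    e->last_failure = 0;
    e->errmsg[0] = '\0';
    if (reset)
    {
        e->failure_count = 0;
        e->stat_reset_timestamp = now;
    }
}
```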
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 4f9c4e9014..56d78ba905 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -277,6 +278,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_changes is true if we're skipping all data modification changes of
+ * the transaction whose XID is specified in MySubscription->skipxid, which is
+ * copied to skipping_xid. Note that we don't skip receiving the changes,
+ * particularly in streaming cases, since we decide whether or not to skip
+ * applying the changes only when starting to apply them. Once we have started
+ * skipping changes, we copy the XID to skipping_xid and don't stop skipping
+ * until we have skipped the whole transaction, even if the subscription is
+ * invalidated and MySubscription->skipxid gets changed or reset. When stopping
+ * the skipping behavior, we reset the skip XID (subskipxid) in pg_subscription
+ * and associate the origin status with the transaction that resets the skip
+ * XID so that we can restart streaming from the next transaction.
+ */
+static bool skipping_changes = false;
+static TransactionId skipping_xid = InvalidTransactionId;
+
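The latching behavior described in the comment above, namely that a started skip is not unlatched by a catalog change, can be sketched as a tiny standalone state machine (plain C; INVALID_XID stands in for InvalidTransactionId, names are illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INVALID_XID 0u

/* Worker-local skip state, mirroring skipping_changes / skipping_xid. */
static bool     skipping = false;
static uint32_t skip_xid = INVALID_XID;

/*
 * On BEGIN: latch the configured XID into local state, so a later change to
 * the subscription catalog cannot stop the skip mid-transaction.
 */
static void
maybe_start_skipping(uint32_t configured_xid, uint32_t remote_xid)
{
    if (configured_xid == INVALID_XID || configured_xid != remote_xid)
        return;
    skipping = true;
    skip_xid = remote_xid;
}

/* On COMMIT/PREPARE: clear the latch; the caller also resets subskipxid. */
static uint32_t
stop_skipping(void)
{
    uint32_t finished = skip_xid;

    skipping = false;
    skip_xid = INVALID_XID;
    return finished;
}
```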
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -360,6 +376,9 @@ static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static void reset_apply_error_context_rel(void);
static void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -857,6 +876,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -881,7 +905,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop skipping changes if enabled. Otherwise, commit the changes
+ * that were just applied.
+ */
+ if (skipping_changes)
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -910,6 +945,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -934,47 +972,57 @@ apply_handle_prepare(StringInfo s)
LSN_FORMAT_ARGS(remote_final_lsn))));
/*
- * Compute unique GID for two_phase transactions. We don't use GID of
- * prepared transaction sent by server as that can lead to deadlock when
- * we have multiple subscriptions from same node point to publications on
- * the same node. See comments atop worker.c
+ * Prepare the transaction unless we are skipping its changes.
*/
- TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
- gid, sizeof(gid));
+ if (skipping_changes)
+ stop_skipping_changes(prepare_data.end_lsn, prepare_data.prepare_time);
+ else
+ {
+ /*
+ * Compute unique GID for two_phase transactions. We don't use GID of
+ * prepared transaction sent by server as that can lead to deadlock
+ * when we have multiple subscriptions from same node point to
+ * publications on the same node. See comments atop worker.c
+ */
+ TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
+ gid, sizeof(gid));
- /*
- * Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
- *
- * XXX, We can optimize such that at commit prepared time, we first check
- * whether we have prepared the transaction or not but that doesn't seem
- * worthwhile because such cases shouldn't be common.
- */
- begin_replication_step();
+ /*
+ * Unlike commit, here, we always prepare the transaction even though
+ * no change has happened in this transaction. It is done this way
+ * because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
+ *
+ * XXX, We can optimize such that at commit prepared time, we first
+ * check whether we have prepared the transaction or not but that
+ * doesn't seem worthwhile because such cases shouldn't be common.
+ */
+ begin_replication_step();
- /*
- * BeginTransactionBlock is necessary to balance the EndTransactionBlock
- * called within the PrepareTransactionBlock below.
- */
- BeginTransactionBlock();
- CommitTransactionCommand(); /* Completes the preceding Begin command. */
+ /*
+ * BeginTransactionBlock is necessary to balance the
+ * EndTransactionBlock called within the PrepareTransactionBlock
+ * below.
+ */
+ BeginTransactionBlock();
+ CommitTransactionCommand(); /* Completes the preceding Begin command. */
- /*
- * Update origin state so we can restart streaming from correct position
- * in case of crash.
- */
- replorigin_session_origin_lsn = prepare_data.end_lsn;
- replorigin_session_origin_timestamp = prepare_data.prepare_time;
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = prepare_data.end_lsn;
+ replorigin_session_origin_timestamp = prepare_data.prepare_time;
- PrepareTransactionBlock(gid);
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ PrepareTransactionBlock(gid);
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
- store_flush_position(prepare_data.end_lsn);
+ }
+ store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
/* Process any tables that are being synchronized in parallel. */
@@ -1087,9 +1135,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !skipping_changes &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1111,6 +1160,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1123,9 +1175,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1207,6 +1256,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1299,6 +1349,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop the skipping transaction if enabled */
+ if (skipping_changes)
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1309,11 +1363,11 @@ static void
apply_handle_stream_commit(StringInfo s)
{
TransactionId xid;
+ LogicalRepCommitData commit_data;
StringInfoData s2;
int nchanges;
char path[MAXPGPATH];
char *buffer = NULL;
- LogicalRepCommitData commit_data;
StreamXidHash *ent;
MemoryContext oldcxt;
BufFile *fd;
@@ -1327,8 +1381,13 @@ apply_handle_stream_commit(StringInfo s)
apply_error_callback_arg.remote_xid = xid;
apply_error_callback_arg.committs = commit_data.committime;
+ remote_final_lsn = commit_data.commit_lsn;
+
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/* Make sure we have an open transaction */
begin_replication_step();
@@ -1360,13 +1419,12 @@ apply_handle_stream_commit(StringInfo s)
MemoryContextSwitchTo(oldcxt);
- remote_final_lsn = commit_data.commit_lsn;
-
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
*/
in_remote_transaction = true;
+
pgstat_report_activity(STATE_RUNNING, NULL);
end_replication_step();
@@ -1439,7 +1497,17 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
nchanges, path);
- apply_handle_commit_internal(&commit_data);
+ if (skipping_changes)
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -1448,7 +1516,6 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
-
reset_apply_error_context_info();
}
@@ -2328,6 +2395,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
ErrorContextCallback errcallback;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (skipping_changes &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
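Note that the filter in apply_dispatch drops only data-modification messages; transaction control (BEGIN/COMMIT) and metadata (e.g. RELATION) must still be processed so the apply loop keeps its protocol state consistent. A standalone sketch of that predicate (illustrative message tags, not the backend enum):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified message tags, mirroring the logical replication protocol letters. */
typedef enum
{
    MSG_BEGIN = 'B',
    MSG_COMMIT = 'C',
    MSG_INSERT = 'I',
    MSG_UPDATE = 'U',
    MSG_DELETE = 'D',
    MSG_TRUNCATE = 'T',
    MSG_RELATION = 'R'
} MsgType;

/* Return true only for data-modification messages while skipping is active. */
static bool
should_skip(bool skipping_changes, MsgType action)
{
    if (!skipping_changes)
        return false;
    return action == MSG_INSERT || action == MSG_UPDATE ||
           action == MSG_DELETE || action == MSG_TRUNCATE;
}
```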
/*
* Push apply error context callback. Other fields will be filled during
* applying the change.
@@ -3788,3 +3866,108 @@ reset_logicalrep_error_context_rel(void)
apply_error_callback_arg.relname = NULL;
}
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by the skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!skipping_changes);
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_changes = true;
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID.
+ *
+ * If origin_lsn and origin_committs are valid, we set origin state to the
+ * transaction commit that resets the skip XID so that we can restart
+ * streaming from the transaction following the one we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(skipping_changes);
+ Assert(TransactionIdIsValid(skipping_xid));
+ Assert(in_remote_transaction);
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_changes = false;
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know that
+ * the subscription is no longer stuck on the conflict.
+ *
+ * The message for clearing the error statistics can be lost, but that's
+ * okay. The user can confirm that logical replication is working in
+ * other ways, for example by checking the pg_stat_subscription view, and
+ * can reset the error statistics of a single subscription with the
+ * pg_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e28248af32..af5c16abfa 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,7 +3659,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3676,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 1104886bef..4a1185a4f6 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -563,9 +563,10 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The clear messages use below field */
+ /* The reset and clear messages use the fields below */
bool m_reset; /* clear all fields and set reset_stats
* timestamp */
+ bool m_clear; /* clear all fields except for total_failure */
/* The error report message uses below fields */
Oid m_databaseid;
@@ -1101,6 +1102,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 67f92b3878..e2ec685f78 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -286,6 +286,28 @@ ERROR: unrecognized subscription parameter: "two_phase"
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ERROR: cannot set streaming = true for two-phase enabled subscription
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid, synchronous_commit, binary, streaming);
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 88743ab33b..2412b28422 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -220,6 +220,25 @@ ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid, synchronous_commit, binary, streaming);
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
DROP SUBSCRIPTION regress_testsub;
diff --git a/src/test/subscription/t/023_skip_xact.pl b/src/test/subscription/t/023_skip_xact.pl
new file mode 100644
index 0000000000..7b29828cce
--- /dev/null
+++ b/src/test/subscription/t/023_skip_xact.pl
@@ -0,0 +1,185 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Test skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 8;
+
+sub test_subscription_error
+{
+ my ($node, $expected, $source, $relname, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# Don't overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ 'wal_retrieve_retry_interval = 5s');
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber we
+# create the same tables but with primary keys, and insert some data that will
+# conflict with data replicated from the publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Start logical replication. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on);");
+
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Also wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violating
+# the unique constraint on test_tab1.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab1 VALUES (1)");
+
+# Insert enough rows to test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber for the same reason.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);");
+
+# Check that the errors on both subscriptions are reported.
+test_subscription_error($node_subscriber, qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'apply', 'test_tab1', 'error reporting by the apply worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'tablesync', 'test_tab2', 'error reporting by the table sync worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'apply', 'test_tab_streaming', 'error reporting by the apply worker');
+
+# Set the XIDs of the transactions in question on the subscriptions to skip them.
+my $skip_xid1 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab1'::regclass");
+my $skip_xid2 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab_streaming'::regclass");
+
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (skip_xid = $skip_xid1)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_streaming SET (skip_xid = $skip_xid2)");
+
+# Restart the subscriber to restart logical replication without waiting for wal_retrieve_retry_interval.
+$node_subscriber->restart;
+
+# Wait until the transactions in question are skipped.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription
+WHERE subname in ('tap_sub', 'tap_sub_streaming') AND subskipxid IS NULL
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+# Insert data to test_tab1 that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+
+# Also, insert data to test_tab_streaming that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transaction.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped transaction");
+
+# Check that the view has no entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
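For reference, the user-visible workflow the test above exercises looks roughly like this (syntax follows this patch set and may change; xid 590 is the example from earlier in the thread):

```sql
-- Identify the failing remote transaction from the new statistics view.
SELECT subname, relid::regclass, xid, failure_count, last_failure_message
FROM pg_stat_subscription_errors;

-- Tell the apply worker to skip that remote transaction.
ALTER SUBSCRIPTION tap_sub SET (skip_xid = 590);

-- After the transaction has been skipped, subskipxid is reset to NULL.
SELECT subname, subskipxid FROM pg_subscription;
```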
From 3f9f615a5c8e60725d61463a18f00b6568ab08f0 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v4 1/3] Add errcontext to errors of the applying logical
replication changes.
This commit adds an error context to errors that happen while applying
logical replication changes, showing the command, the relation name,
the transaction ID, and the commit timestamp in the server log.
---
src/backend/commands/tablecmds.c | 7 +
src/backend/replication/logical/proto.c | 49 +++++
src/backend/replication/logical/worker.c | 220 ++++++++++++++++++++---
src/include/replication/logicalproto.h | 1 +
src/include/replication/logicalworker.h | 2 +
5 files changed, 257 insertions(+), 22 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fcd778c62a..911bef8312 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -78,6 +78,7 @@
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "pgstat.h"
+#include "replication/logicalworker.h"
#include "rewrite/rewriteDefine.h"
#include "rewrite/rewriteHandler.h"
#include "rewrite/rewriteManip.h"
@@ -1899,6 +1900,9 @@ ExecuteTruncateGuts(List *explicit_rels,
continue;
}
+ /* Set logical replication error callback info if necessary */
+ set_logicalrep_error_context_rel(rel);
+
/*
* Build the lists of foreign tables belonging to each foreign server
* and pass each list to the foreign data wrapper's callback function,
@@ -2006,6 +2010,9 @@ ExecuteTruncateGuts(List *explicit_rels,
pgstat_count_truncate(rel);
}
+ /* Reset logical replication error callback info */
+ reset_logicalrep_error_context_rel();
+
/* Now go through the hash table, and truncate foreign tables */
if (ft_htab)
{
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index a245252529..54fff7df21 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1109,3 +1109,52 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Get a string representing the given LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 186be1a188..d346377b20 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -221,6 +221,27 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname; /* used for error context */
+ char *relname; /* used for error context */
+
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .relname = NULL,
+ .nspname = NULL,
+ .remote_xid = InvalidTransactionId,
+ .committs = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -332,6 +353,10 @@ static void apply_handle_tuple_routing(ApplyExecutionData *edata,
/* Compute GID for two_phase transactions */
static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int szgid);
+static void apply_error_callback(void *arg);
+static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
+static void reset_apply_error_context_rel(void);
+static void reset_apply_error_context_info(void);
/*
* Should this worker apply changes for given relation.
@@ -825,6 +850,8 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.committime;
remote_final_lsn = begin_data.final_lsn;
@@ -858,6 +885,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -875,6 +903,8 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.prepare_time;
remote_final_lsn = begin_data.prepare_lsn;
@@ -949,6 +979,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -961,6 +992,8 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ apply_error_callback_arg.remote_xid = prepare_data.xid;
+ apply_error_callback_arg.committs = prepare_data.commit_time;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -988,6 +1021,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1000,6 +1034,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ apply_error_callback_arg.remote_xid = rollback_data.xid;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1037,6 +1072,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1093,6 +1129,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ apply_error_callback_arg.remote_xid = stream_xid;
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1149,6 +1187,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1172,7 +1211,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ apply_error_callback_arg.remote_xid = xid;
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1197,6 +1239,7 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ apply_error_callback_arg.remote_xid = subxid;
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1221,6 +1264,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1252,6 +1296,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1276,6 +1322,8 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.committs = commit_data.committime;
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1398,6 +1446,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1517,6 +1567,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1540,6 +1593,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1638,6 +1694,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1691,6 +1750,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1794,6 +1856,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1819,6 +1884,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2223,6 +2291,9 @@ apply_handle_truncate(StringInfo s)
* Even if we used CASCADE on the upstream primary we explicitly default
* to replaying changes without further cascading. This might be later
* changeable with a user specified option.
+ *
+ * Both namespace and relation name for error callback will be set in
+ * ExecuteTruncateGuts().
*/
ExecuteTruncateGuts(rels,
relids,
@@ -2253,44 +2324,54 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
+
+ /*
+ * Push the apply error context callback. Other fields will be filled
+ * while applying the change.
+ */
+ apply_error_callback_arg.command = action;
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2299,45 +2380,48 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_END:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
}
/*
@@ -3570,3 +3654,95 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("during apply of \"%s\""),
+ logicalrep_message_type(apply_error_callback_arg.command));
+
+ if (apply_error_callback_arg.relname)
+ appendStringInfo(&buf, _(" for relation \"%s.%s\""),
+ apply_error_callback_arg.nspname,
+ apply_error_callback_arg.relname);
+
+ if (TransactionIdIsNormal(apply_error_callback_arg.remote_xid))
+ appendStringInfo(&buf, _(" in transaction with xid %u commit timestamp %s"),
+ apply_error_callback_arg.remote_xid,
+ apply_error_callback_arg.committs == 0
+ ? "(unset)"
+ : timestamptz_to_str(apply_error_callback_arg.committs));
+
+ errcontext("%s", buf.data);
+}
+
+/* Set relation information of apply error callback */
+static void
+set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
+{
+ apply_error_callback_arg.nspname = rel->remoterel.nspname;
+ apply_error_callback_arg.relname = rel->remoterel.relname;
+}
+
+/* Reset relation information of apply error callback */
+static void
+reset_apply_error_context_rel(void)
+{
+ apply_error_callback_arg.nspname = NULL;
+ apply_error_callback_arg.relname = NULL;
+}
+
+/* Reset all information of apply error callback */
+static void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.remote_xid = InvalidTransactionId;
+ apply_error_callback_arg.committs = 0;
+ reset_apply_error_context_rel();
+}
+
+/*
+ * Set relation information of error callback.
+ *
+ * Both set_logicalrep_error_context_rel() and
+ * reset_logicalrep_error_context_rel() are intended to be used by
+ * functions outside the logical replication module that don't use
+ * LogicalRepRelMapEntry.
+ *
+ * The caller must call reset_logicalrep_error_context_rel() after use
+ * so that the memory used for the names is freed.
+ */
+void
+set_logicalrep_error_context_rel(Relation rel)
+{
+ if (IsLogicalWorker())
+ {
+ apply_error_callback_arg.nspname =
+ get_namespace_name(RelationGetNamespace(rel));
+ apply_error_callback_arg.relname =
+ pstrdup(RelationGetRelationName(rel));
+ }
+}
+
+/* Reset relation information previously set for the error callback */
+void
+reset_logicalrep_error_context_rel(void)
+{
+ if (IsLogicalWorker())
+ {
+ if (apply_error_callback_arg.nspname)
+ pfree(apply_error_callback_arg.nspname);
+ apply_error_callback_arg.nspname = NULL;
+
+ if (apply_error_callback_arg.relname)
+ pfree(apply_error_callback_arg.relname);
+ apply_error_callback_arg.relname = NULL;
+ }
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 63de90d94a..c78a4409bc 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -242,5 +242,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/include/replication/logicalworker.h b/src/include/replication/logicalworker.h
index 2ad61a001a..d3e8514ffd 100644
--- a/src/include/replication/logicalworker.h
+++ b/src/include/replication/logicalworker.h
@@ -15,5 +15,7 @@
extern void ApplyWorkerMain(Datum main_arg);
extern bool IsLogicalWorker(void);
+extern void set_logicalrep_error_context_rel(Relation rel);
+extern void reset_logicalrep_error_context_rel(void);
#endif /* LOGICALWORKER_H */
--
2.24.3 (Apple Git-128)
From 84752e871e0c46bbec1b02cdf87dfe092f67dfc2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v4 2/3] Add pg_stat_logical_replication_error statistics view.
This commit adds a new system view, pg_stat_subscription_errors,
showing errors that happen while applying logical replication changes
as well as during initial table synchronization.
It also adds the SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
---
doc/src/sgml/monitoring.sgml | 151 +++++
src/backend/catalog/pg_subscription.c | 23 +-
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/commands/subscriptioncmds.c | 30 +-
src/backend/postmaster/pgstat.c | 639 ++++++++++++++++++++
src/backend/replication/logical/tablesync.c | 16 +-
src/backend/replication/logical/worker.c | 48 +-
src/backend/utils/adt/pgstatfuncs.c | 120 ++++
src/backend/utils/error/elog.c | 16 +
src/include/catalog/pg_proc.dat | 13 +
src/include/catalog/pg_subscription_rel.h | 3 +-
src/include/pgstat.h | 112 ++++
src/include/utils/elog.h | 1 +
src/test/regress/expected/rules.out | 22 +
15 files changed, 1192 insertions(+), 31 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..ca9eec5e22 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per subscription error, showing information about
+ errors that occurred on subscriptions.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,126 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error occurred. This
+ field is always NULL if the error was reported by a
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher's transaction being applied when the
+ error occurred. This field is always NULL if the error was reported
+ by a <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times the error has occurred on the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the last failure.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5301,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..d76bdff36a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -541,18 +541,27 @@ GetSubscriptionRelations(Oid subid)
/*
* Get all relations for subscription that are not in a ready state.
*
- * Returned list is palloc'ed in current memory context.
+ * The returned HTAB is created in the current memory context.
*/
-List *
+HTAB *
GetSubscriptionNotReadyRelations(Oid subid)
{
- List *res = NIL;
+ HTAB *htab;
+ HASHCTL hash_ctl;
Relation rel;
HeapTuple tup;
int nkeys = 0;
ScanKeyData skey[2];
SysScanDesc scan;
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ hash_ctl.hcxt = CurrentMemoryContext;
+ htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
rel = table_open(SubscriptionRelRelationId, AccessShareLock);
ScanKeyInit(&skey[nkeys++],
@@ -577,8 +586,8 @@ GetSubscriptionNotReadyRelations(Oid subid)
subrel = (Form_pg_subscription_rel) GETSTRUCT(tup);
- relstate = (SubscriptionRelState *) palloc(sizeof(SubscriptionRelState));
- relstate->relid = subrel->srrelid;
+ relstate = (SubscriptionRelState *) hash_search(htab, (void *) &subrel->srrelid,
+ HASH_ENTER, NULL);
relstate->state = subrel->srsubstate;
d = SysCacheGetAttr(SUBSCRIPTIONRELMAP, tup,
Anum_pg_subscription_rel_srsublsn, &isnull);
@@ -586,13 +595,11 @@ GetSubscriptionNotReadyRelations(Oid subid)
relstate->lsn = InvalidXLogRecPtr;
else
relstate->lsn = DatumGetLSN(d);
-
- res = lappend(res, relstate);
}
/* Cleanup */
systable_endscan(scan);
table_close(rel, AccessShareLock);
- return res;
+ return htab;
}
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..cd07f2e02f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
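For reviewers, here is a sketch of how a user might consult the new view on the subscriber to find the failing transaction; the column names follow the `pg_stat_subscription_errors` definition above, while the subscription name `test_sub` and the output shape are hypothetical:

```sql
-- Identify the problem subscription, relation, command, and remote XID.
SELECT subname, relid::regclass AS relation, command, xid,
       failure_source, failure_count, last_failure_message
  FROM pg_stat_subscription_errors
 WHERE subname = 'test_sub';
```

The returned `xid` is the remote transaction ID that could then be passed to the proposed ALTER SUBSCRIPTION ... SET SKIP command.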
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 22ae982328..da02d3bbfa 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -86,7 +86,7 @@ typedef struct SubOpts
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
-static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
+static void ReportSlotConnectionError(HTAB *rstates, Oid subid, char *slotname, char *err);
/*
@@ -1163,7 +1163,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
char *err = NULL;
WalReceiverConn *wrconn;
Form_pg_subscription form;
- List *rstates;
+ HTAB *rstates;
+ HASH_SEQ_STATUS hstat;
+ SubscriptionRelState *rstate;
/*
* Lock pg_subscription with AccessExclusiveLock to ensure that the
@@ -1286,9 +1288,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* exclusive lock on the subscription.
*/
rstates = GetSubscriptionNotReadyRelations(subid);
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
Oid relid = rstate->relid;
/* Only cleanup resources of tablesync workers */
@@ -1321,8 +1323,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* If there is no slot associated with the subscription, we can finish
* here.
*/
- if (!slotname && rstates == NIL)
+ if (!slotname && hash_get_num_entries(rstates) == 0)
{
+ hash_destroy(rstates);
table_close(rel, NoLock);
return;
}
@@ -1346,7 +1349,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
if (!slotname)
{
/* be tidy */
- list_free(rstates);
+ hash_destroy(rstates);
table_close(rel, NoLock);
return;
}
@@ -1358,9 +1361,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
PG_TRY();
{
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
Oid relid = rstate->relid;
/* Only cleanup resources of tablesync workers */
@@ -1389,7 +1392,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
}
- list_free(rstates);
+ hash_destroy(rstates);
/*
* If there is a slot associated with the subscription, then drop the
@@ -1641,13 +1644,14 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
* them manually, if required.
*/
static void
-ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err)
+ReportSlotConnectionError(HTAB *rstates, Oid subid, char *slotname, char *err)
{
- ListCell *lc;
+ HASH_SEQ_STATUS hstat;
+ SubscriptionRelState *rstate;
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
Oid relid = rstate->relid;
/* Only cleanup resources of tablesync workers */
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 11702f2a80..4a35e6640f 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/partition.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE 32
/* ----------
@@ -279,6 +282,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionErrHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -320,6 +324,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ bool create);
+static PgStat_StatSubRelErrEntry *pgstat_get_subscription_rel_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -358,6 +368,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1134,6 +1148,133 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionErrHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubErrEntry *suberrent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionErrHash);
+ while ((suberrent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASH_SEQ_STATUS hstat_rel;
+ HTAB *rstates;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(suberrent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = suberrent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (suberrent->suberrors == NULL)
+ continue;
+
+ /*
+ * The subscription has error entries. We search for errors of the
+ * table sync workers that are already in sync state. Those errors
+ * should be removed.
+ *
+ * Note that the lifetimes of error entries of the apply worker and
+ * the table sync worker are different. The former lives until
+ * the subscription is dropped whereas the latter lives until the
+ * table synchronization is completed.
+ */
+ rstates = GetSubscriptionNotReadyRelations(suberrent->subid);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = suberrent->subid;
+
+ hash_seq_init(&hstat_rel, suberrent->suberrors);
+ while ((relerrent = (PgStat_StatSubRelErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(relerrent->subrelid))
+ continue;
+
+ /*
+ * Add the relid to the message if the table synchronization
+ * for this relation has already completed or the table is no
+ * longer subscribed.
+ */
+ if (hash_search(rstates, (void *) &(relerrent->subrelid), HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = relerrent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(rstates);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1543,6 +1684,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the error entry of the given subscription.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1863,6 +2023,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about an error on the given subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_reset = false;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_last_failure = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2895,6 +3086,38 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription errors struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, false);
+}
+
+/*
+ * ---------
+ * pgstat_fetch_subscription_rel_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubRelErrEntry *
+pgstat_fetch_subscription_rel_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_rel_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3424,6 +3647,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3725,6 +3961,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionErrHash)
+ {
+ PgStat_StatSubErrEntry *suberrent;
+
+ hash_seq_init(&hstat, subscriptionErrHash);
+ while ((suberrent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASH_SEQ_STATUS relhstat;
+ long nrels;
+
+ /* Skip this subscription if it does not have any errors */
+ if (suberrent->suberrors == NULL)
+ continue;
+
+ nrels = hash_get_num_entries(suberrent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(suberrent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nrels, sizeof(long), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, suberrent->suberrors);
+ while ((relerrent = (PgStat_StatSubRelErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubRelErrEntry entry, which
+ * contains the fixed-length error message string of
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes, bloating the stats
+ * file. We don't expect to have many error entries, but if
+ * that expectation turns out to be false we should write the
+ * string and its length instead.
+ */
+ rc = fwrite(relerrent, sizeof(PgStat_StatSubRelErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4184,6 +4464,99 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubErrEntry struct followed by the number
+ * of errors and PgStat_StatSubRelErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubErrEntry suberrbuf;
+ PgStat_StatSubErrEntry *suberrent;
+ long nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&suberrbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionErrHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionErrHash = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription error entry */
+ suberrent =
+ (PgStat_StatSubErrEntry *) hash_search(subscriptionErrHash,
+ (void *) &(suberrbuf.subid),
+ HASH_ENTER, NULL);
+ suberrent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(long), fpin) != sizeof(long))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubRelErrEntry *subrelent;
+ PgStat_StatSubRelErrEntry subrelbuf;
+
+ if (fread(&subrelbuf, 1, sizeof(PgStat_StatSubRelErrEntry), fpin) !=
+ sizeof(PgStat_StatSubRelErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (suberrent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubRelErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ suberrent->suberrors = hash_create("Subscription relation error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ subrelent =
+ (PgStat_StatSubRelErrEntry *) hash_search(suberrent->suberrors,
+ (void *) &(subrelbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(subrelent, &subrelbuf, sizeof(PgStat_StatSubRelErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4526,6 +4899,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubErrEntry struct followed by the number
+ * of errors and PgStat_StatSubRelErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubErrEntry mySubErrs;
+ PgStat_StatSubRelErrEntry subrelbuf;
+ long nrels;
+
+ if (fread(&mySubErrs, 1, sizeof(PgStat_StatSubErrEntry), fpin)
+ != sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nrels, 1, sizeof(long), fpin) != sizeof(long))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nrels; i++)
+ {
+ if (fread(&subrelbuf, 1, sizeof(PgStat_StatSubRelErrEntry), fpin) !=
+ sizeof(PgStat_StatSubRelErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4716,6 +5133,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionErrHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5650,6 +6068,122 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubRelErrEntry *relerrent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ relerrent = pgstat_get_subscription_rel_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ if (relerrent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ relerrent->relid = InvalidOid;
+ relerrent->command = 0;
+ relerrent->xid = InvalidTransactionId;
+ relerrent->failure_count = 0;
+ relerrent->last_failure = 0;
+ relerrent->errmsg[0] = '\0';
+ relerrent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ Assert(relerrent);
+
+ /* update the error entry */
+ relerrent->databaseid = msg->m_databaseid;
+ relerrent->relid = msg->m_relid;
+ relerrent->command = msg->m_command;
+ relerrent->xid = msg->m_xid;
+ relerrent->failure_count++;
+ relerrent->last_failure = msg->m_last_failure;
+ strlcpy(relerrent->errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionErrHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubErrEntry *suberrent;
+
+ suberrent = hash_search(subscriptionErrHash, (void *) &(msg->m_subids[i]),
+ HASH_FIND, NULL);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (suberrent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (suberrent->suberrors != NULL)
+ hash_destroy(suberrent->suberrors);
+
+ (void) hash_search(subscriptionErrHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubErrEntry *suberrent;
+
+ if (subscriptionErrHash == NULL)
+ return;
+
+ suberrent = hash_search(subscriptionErrHash, (void *) &(msg->m_subid),
+ HASH_FIND, NULL);
+
+ /*
+ * Nothing to do if the subscription entry is not found or has no error
+ * entries. This could happen when the subscription with msg->m_subid is
+ * removed and the corresponding entry is also removed before receiving
+ * the error purge message.
+ */
+ if (suberrent == NULL || suberrent->suberrors == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ (void) hash_search(suberrent->suberrors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+}
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5747,6 +6281,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription OID.
+ * Return NULL if not found and the caller didn't request to create it.
+ *
+ * create tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, bool create)
+{
+ PgStat_StatSubErrEntry *suberrent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionErrHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subscriptionErrHash = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ suberrent = (PgStat_StatSubErrEntry *) hash_search(subscriptionErrHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ suberrent->suberrors = NULL;
+
+ return suberrent;
+}
+
+/* ----------
+ * pgstat_get_subscription_rel_error_entry
+ *
+ * Return the subscription relation error entry for the given subscription
+ * OID and relation OID. Return NULL if not found and the caller didn't
+ * request to create it.
+ *
+ * create tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubRelErrEntry *
+pgstat_get_subscription_rel_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubErrEntry *suberrent;
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ suberrent = pgstat_get_subscription_error_entry(subid, create);
+
+ if (suberrent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (suberrent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubRelErrEntry);
+ suberrent->suberrors = hash_create("Subscription relation error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ relerrent = (PgStat_StatSubRelErrEntry *) hash_search(suberrent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ relerrent->databaseid = InvalidOid;
+ relerrent->relid = InvalidOid;
+ relerrent->command = 0;
+ relerrent->xid = InvalidTransactionId;
+ relerrent->failure_count = 0;
+ relerrent->last_failure = 0;
+ relerrent->errmsg[0] = '\0';
+ relerrent->stat_reset_timestamp = 0;
+ }
+
+ return relerrent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a43c..8765396432 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1175,8 +1175,8 @@ FetchTableStates(bool *started_tx)
if (!table_states_valid)
{
MemoryContext oldctx;
- List *rstates;
- ListCell *lc;
+ HTAB *rstates;
+ HASH_SEQ_STATUS hstat;
SubscriptionRelState *rstate;
/* Clean the old lists. */
@@ -1194,14 +1194,18 @@ FetchTableStates(bool *started_tx)
/* Allocate the tracking info in a permanent memory context. */
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- rstate = palloc(sizeof(SubscriptionRelState));
- memcpy(rstate, lfirst(lc), sizeof(SubscriptionRelState));
- table_states_not_ready = lappend(table_states_not_ready, rstate);
+ SubscriptionRelState *r = palloc(sizeof(SubscriptionRelState));
+
+ memcpy(r, rstate, sizeof(SubscriptionRelState));
+ table_states_not_ready = lappend(table_states_not_ready, r);
}
MemoryContextSwitchTo(oldctx);
+ hash_destroy(rstates);
+
/*
* Does the subscription have tables?
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index d346377b20..4f9c4e9014 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -227,6 +227,7 @@ typedef struct ApplyErrCallbackArg
LogicalRepMsgType command; /* 0 if invalid */
/* Local relation information */
+ Oid relid; /* used for error reporting */
char *nspname; /* used for error context */
char *relname; /* used for error context */
@@ -236,6 +237,7 @@ typedef struct ApplyErrCallbackArg
static ApplyErrCallbackArg apply_error_callback_arg =
{
.command = 0,
+ .relid = InvalidOid,
.relname = NULL,
.nspname = NULL,
.remote_xid = InvalidTransactionId,
@@ -3522,8 +3524,26 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ geterrmessage());
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3641,7 +3661,24 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.relid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ geterrmessage());
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
@@ -3686,6 +3723,7 @@ apply_error_callback(void *arg)
static void
set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
{
+ apply_error_callback_arg.relid = rel->localreloid;
apply_error_callback_arg.nspname = rel->remoterel.nspname;
apply_error_callback_arg.relname = rel->remoterel.relname;
}
@@ -3694,6 +3732,7 @@ set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
static void
reset_apply_error_context_rel(void)
{
+ apply_error_callback_arg.relid = InvalidOid;
apply_error_callback_arg.nspname = NULL;
apply_error_callback_arg.relname = NULL;
}
@@ -3724,6 +3763,7 @@ set_logicalrep_error_context_rel(Relation rel)
{
if (IsLogicalWorker())
{
+ apply_error_callback_arg.relid = RelationGetRelid(rel);
apply_error_callback_arg.nspname =
get_namespace_name(RelationGetNamespace(rel));
apply_error_callback_arg.relname =
@@ -3737,6 +3777,8 @@ reset_logicalrep_error_context_rel(void)
{
if (IsLogicalWorker())
{
+ apply_error_callback_arg.relid = InvalidOid;
+
if (apply_error_callback_arg.nspname)
pfree(apply_error_callback_arg.nspname);
apply_error_callback_arg.nspname = NULL;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f0e09eae4d..f1348a415e 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,8 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
+#include "replication/logicalworker.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2240,6 +2242,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset subscription error stats */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2380,3 +2399,104 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubRelErrEntry *relerrent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ relerrent = pgstat_fetch_subscription_rel_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (relerrent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(relerrent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(relerrent->relid))
+ values[2] = ObjectIdGetDatum(relerrent->relid);
+ else
+ nulls[2] = true;
+
+ if (relerrent->command == 0)
+ nulls[3] = true;
+ else
+ {
+ /* should be apply worker */
+ Assert(!OidIsValid(relerrent->subrelid));
+
+ values[3] = CStringGetTextDatum(logicalrep_message_type(relerrent->command));
+ }
+
+ if (TransactionIdIsValid(relerrent->xid))
+ values[4] = TransactionIdGetDatum(relerrent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(relerrent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ values[6] = Int64GetDatum(relerrent->failure_count);
+
+ if (relerrent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(relerrent->last_failure);
+
+ values[8] = CStringGetTextDatum(relerrent->errmsg);
+
+ if (relerrent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(relerrent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..dd36850016 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,6 +1441,22 @@ getinternalerrposition(void)
return edata->internalpos;
}
+/*
+ * geterrmessage --- return the currently set error message
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+const char *
+geterrmessage(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return (const char *) edata->message;
+}
/*
* Functions to allow construction of error message strings separately from
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 8cd0252082..044ff52227 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5321,6 +5321,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5708,6 +5716,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index 632381b4e3..50053cdafc 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -22,6 +22,7 @@
#include "catalog/genbki.h"
#include "catalog/pg_subscription_rel_d.h"
#include "nodes/pg_list.h"
+#include "utils/hsearch.h"
/* ----------------
* pg_subscription_rel definition. cpp turns this into
@@ -89,6 +90,6 @@ extern void RemoveSubscriptionRel(Oid subid, Oid relid);
extern bool HasSubscriptionRelations(Oid subid);
extern List *GetSubscriptionRelations(Oid subid);
-extern List *GetSubscriptionNotReadyRelations(Oid subid);
+extern HTAB *GetSubscriptionNotReadyRelations(Oid subid);
#endif /* PG_SUBSCRIPTION_REL_H */
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9612c0a6c2..1104886bef 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -539,6 +543,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
+ * update/reset/clear an error happening during logical
+ * replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* The clear message uses the field below */
+ bool m_reset; /* clear all fields and set reset_stats
+ * timestamp */
+
+ /* The error report message uses the fields below */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_last_failure;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by autovacuum to purge the subscription
+ * errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by autovacuum to purge the subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -710,6 +776,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -908,6 +977,42 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription error statistics kept in the stats collector.
+ *
+ * PgStat_StatSubErrEntry holds all errors associated with the subscription,
+ * reported by the apply worker and the table sync workers. This entry is
+ * created when the first error message for the subscription is reported
+ * and is dropped along with its errors when the subscription is dropped.
+ *
+ * PgStat_StatSubRelErrEntry represents an error that happened during logical
+ * replication, reported by the apply worker (subrelid is InvalidOid) or by the
+ * table sync worker (subrelid is a valid OID). The error reported by the apply
+ * worker is dropped when the subscription is dropped, whereas the error reported
+ * by the table sync worker is dropped when the table synchronization process
+ * completes.
+ */
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubErrEntry;
+
+typedef struct PgStat_StatSubRelErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubRelErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -995,6 +1100,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1011,6 +1117,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1106,6 +1215,9 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid);
+extern PgStat_StatSubRelErrEntry *pgstat_fetch_subscription_rel_error(Oid subid,
+ Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f53607e12e..155145a77d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -222,6 +222,7 @@ extern int err_generic_string(int field, const char *str);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
+extern const char *geterrmessage(void);
/*----------
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..ffad9790ae 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
--
2.24.3 (Apple Git-128)
Attachment: 0001-Remove-unused-function-argument-in-apply_handle_comm.patch
From 780a4f0cf0e2ed8a84710c0a564ec3d8eaeec3ee Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 26 Jul 2021 11:34:36 +0900
Subject: [PATCH 1/4] Remove unused function argument in
apply_handle_commit_internal()
---
src/backend/replication/logical/worker.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index b9a7a7ffbb..186be1a188 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -309,8 +309,7 @@ static void maybe_reread_subscription(void);
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
-static void apply_handle_commit_internal(StringInfo s,
- LogicalRepCommitData *commit_data);
+static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
TupleTableSlot *remoteslot);
@@ -853,7 +852,7 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(s, &commit_data);
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -1390,7 +1389,7 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
nchanges, path);
- apply_handle_commit_internal(s, &commit_data);
+ apply_handle_commit_internal(&commit_data);
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -1405,7 +1404,7 @@ apply_handle_stream_commit(StringInfo s)
* Helper function for apply_handle_commit and apply_handle_stream_commit.
*/
static void
-apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
+apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
if (IsTransactionState())
{
--
2.24.3 (Apple Git-128)
On Thu, Jul 29, 2021 at 2:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Mon, Jul 26, 2021 at 11:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Thu, Jul 22, 2021 at 8:53 PM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > > On July 20, 2021 9:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > On Mon, Jul 19, 2021 at 8:38 PM houzj.fnst@fujitsu.com
> > > > <houzj.fnst@fujitsu.com> wrote:
> > > > > On July 19, 2021 2:40 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > I've attached the updated version patch that incorporated all
> > > > > > comments I got so far except for the clearing error details part I
> > > > > > mentioned above. After getting a consensus on those parts, I'll
> > > > > > incorporate the idea into the patches.
> > > > >
> > > > > 3) For 0003 patch, if the user sets skip_xid to a wrong xid which has
> > > > > not been assigned, will the change be skipped when the xid is assigned
> > > > > in the future even if it doesn't cause any conflicts?
> > > >
> > > > Yes. Currently, setting a correct xid is the user's responsibility. I think
> > > > it would be better to disable it or emit WARNING/ERROR when the user
> > > > mistakenly sets the wrong xid, if we find out a convenient way to detect that.
> > >
> > > Thanks for the explanation. As Amit suggested, it seems we can document the
> > > risk of misusing skip_xid. Besides, I found some minor things in the patch.
> > >
> > > 1) In 0002 patch
> > >
> > > + */
> > > +static void
> > > +pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
> > > +{
> > > +    if (subscriptionErrHash != NULL)
> > > +        return;
> > > +
> > > +static void
> > > +pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
> > > +{
> > >
> > > The second parameter "len" seems not used in the functions
> > > pgstat_recv_subscription_purge() and pgstat_recv_subscription_error().
> >
> > 'len' is not used at all in not only the functions the patch added but
> > also other pgstat_recv_* functions. Can we remove all of them in a
> > separate patch? 'len' in pgstat_recv_* functions has never been used
> > since the stats collector code was introduced. It seems that it was
> > mistakenly introduced in the first commit, and the other pgstat_recv_*
> > functions that followed also defined 'len' but didn't use it at all.
> >
> > > 2) in 0003 patch
> > >
> > >  * Helper function for apply_handle_commit and apply_handle_stream_commit.
> > >  */
> > > static void
> > > -apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
> > > +apply_handle_commit_internal(LogicalRepCommitData *commit_data)
> > > {
> > >
> > > This looks like a separate change which removes an unused parameter in
> > > existing code; maybe we can get this committed first?
> >
> > Yeah, it seems to have been introduced by commit 0926e96c493. I've attached
> > the patch for that.
> >
> > Also, I've attached the updated version patches. This version has a
> > pg_stat_reset_subscription_error() SQL function and sends a clear
> > message after skipping the transaction. The 0004 patch includes both the
> > skipping-transaction feature and the introduction of RESET to ALTER
> > SUBSCRIPTION. It would be better to separate them.
>
> I've attached the new version patches that fix the cfbot failure.
Sorry I've attached wrong ones. Reattached the correct version patches.
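For readers following the thread, the intended workflow on the subscriber with these patches would look roughly like the following. This is only a sketch: the subscription name `test_sub` and xid 716 are taken from the documentation example in the 0003 patch, and the exact function signatures (in particular the second argument of `pg_stat_reset_subscription_error()`) may change in later patch versions.

```sql
-- Identify the failing remote transaction from the new statistics view.
SELECT subname, command, xid, failure_source, failure_count, last_failure_message
FROM pg_stat_subscription_errors;

-- Tell the apply worker to skip that remote transaction.
ALTER SUBSCRIPTION test_sub SET (skip_xid = 716);

-- subskipxid is cleared automatically once the transaction is skipped;
-- the new RESET form can also clear it (and other parameters) manually.
ALTER SUBSCRIPTION test_sub RESET (skip_xid);

-- Reset the stored error statistics for the subscription (0002 patch);
-- passing NULL for the relation targets the apply worker's entry.
SELECT pg_stat_reset_subscription_error(
         (SELECT oid FROM pg_subscription WHERE subname = 'test_sub'),
         NULL);
```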
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v4-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch
From 79f59f1756ae3f783bb3fa8da4041631013e269a Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:18:58 +0900
Subject: [PATCH v4 3/3] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates any constraint, logical replication stops
until it is resolved. This commit introduces another way to skip the
transaction in question.
The user can specify the XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating the pg_subscription.subskipxid field and telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction. After skipping
the transaction, the apply worker clears subskipxid. It also clears the
error statistics of the subscription in the pg_stat_subscription_errors
system view.
To reset the skip_xid parameter (and other parameters), this commit also
adds a RESET command to ALTER SUBSCRIPTION.
---
doc/src/sgml/logical-replication.sgml | 49 +++-
doc/src/sgml/ref/alter_subscription.sgml | 46 +++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 146 +++++++++--
src/backend/parser/gram.y | 11 +-
src/backend/postmaster/pgstat.c | 41 +++-
src/backend/replication/logical/worker.c | 271 +++++++++++++++++----
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 4 +-
src/include/pgstat.h | 4 +-
src/test/regress/expected/subscription.out | 22 ++
src/test/regress/sql/subscription.sql | 19 ++
src/test/subscription/t/023_skip_xact.pl | 185 ++++++++++++++
13 files changed, 727 insertions(+), 85 deletions(-)
create mode 100644 src/test/subscription/t/023_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..992d8b4ac1 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <link linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in the <structname>pg_stat_subscription_errors</structname>
+ view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: during apply of "INSERT" for relation "public.test" in transaction with xid 716 committs 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID containing the change that violates the constraint can be
+ found in those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> on the subscription
+ with <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Either way, these should be used as a last resort. They skip the whole
+ transaction, including changes that may not violate any constraint, and can
+ easily make the subscriber inconsistent if the user specifies the wrong
+ transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..591f554fc7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,15 +193,46 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
- This clause alters parameters originally set by
- <xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
- are <literal>slot_name</literal>,
- <literal>synchronous_commit</literal>,
- <literal>binary</literal>, and
- <literal>streaming</literal>.
+ This clause sets or resets a subscription option. The parameters that can be
+ set are the parameters originally set by <xref linkend="sql-createsubscription"/>:
+ <literal>slot_name</literal>, <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>, and the following
+ parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ will stop until the problem is resolved. The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ the incoming change or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose changes the logical replication
+ worker should skip. The worker skips all data modification
+ changes within the specified transaction. Therefore, since it skips
+ the whole transaction, including changes that may not violate the
+ constraint, it should only be used as a last resort. This option has
+ no effect on a transaction that is already prepared with
+ <literal>two_phase</literal> enabled on the subscriber. After the logical
+ replication successfully skips the transaction, the transaction ID
+ (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>,
+ and <literal>skip_xid</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d76bdff36a..8ecc55150e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index da02d3bbfa..0cc965c056 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -99,7 +101,8 @@ static void ReportSlotConnectionError(HTAB *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -128,12 +131,23 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset)
+ {
+ if (defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
+ }
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -141,7 +155,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_CONNECT;
- opts->connect = defGetBoolean(defel);
+ if (!is_reset)
+ opts->connect = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_ENABLED) &&
strcmp(defel->defname, "enabled") == 0)
@@ -150,7 +165,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_ENABLED;
- opts->enabled = defGetBoolean(defel);
+ if (!is_reset)
+ opts->enabled = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_CREATE_SLOT) &&
strcmp(defel->defname, "create_slot") == 0)
@@ -159,7 +175,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_CREATE_SLOT;
- opts->create_slot = defGetBoolean(defel);
+ if (!is_reset)
+ opts->create_slot = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_SLOT_NAME) &&
strcmp(defel->defname, "slot_name") == 0)
@@ -168,7 +185,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SLOT_NAME;
- opts->slot_name = defGetString(defel);
+ if (!is_reset)
+ opts->slot_name = defGetString(defel);
/* Setting slot_name = NONE is treated as no slot name. */
if (strcmp(opts->slot_name, "none") == 0)
@@ -183,7 +201,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_COPY_DATA;
- opts->copy_data = defGetBoolean(defel);
+ if (!is_reset)
+ opts->copy_data = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_SYNCHRONOUS_COMMIT) &&
strcmp(defel->defname, "synchronous_commit") == 0)
@@ -192,12 +211,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -206,7 +231,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_REFRESH;
- opts->refresh = defGetBoolean(defel);
+ if (!is_reset)
+ opts->refresh = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_BINARY) &&
strcmp(defel->defname, "binary") == 0)
@@ -215,7 +241,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +251,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -245,7 +273,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
- opts->twophase = defGetBoolean(defel);
+ if (!is_reset)
+ opts->twophase = defGetBoolean(defel);
+ }
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
}
else
ereport(ERROR,
@@ -416,7 +468,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -489,6 +542,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -885,14 +939,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -944,14 +998,60 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_STREAMING |
+ SUBOPT_BINARY | SUBOPT_SKIP_XID);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -986,7 +1086,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1036,7 +1136,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1084,7 +1184,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..bcf85e8980 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,7 +9707,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 4a35e6640f..d3f0a2ea2f 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1699,6 +1699,27 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
msg.m_subid = subid;
msg.m_subrelid = subrelid;
msg.m_reset = true;
+ msg.m_clear = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_clear) + sizeof(bool));
+}
+
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error entry of the subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = false;
+ msg.m_clear = true;
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_clear) + sizeof(bool));
+}
@@ -2046,6 +2067,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
msg.m_reset = false;
+ msg.m_clear = false;
msg.m_command = command;
msg.m_xid = xid;
msg.m_last_failure = GetCurrentTimestamp();
@@ -6078,26 +6100,37 @@ static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
PgStat_StatSubRelErrEntry *relerrent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
relerrent = pgstat_get_subscription_rel_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
+ Assert(!(msg->m_reset && msg->m_clear));
+
if (!relerrent)
return;
- /* reset fields and set reset timestamp */
+ /* reset fields */
relerrent->relid = InvalidOid;
relerrent->command = 0;
relerrent->xid = InvalidTransactionId;
- relerrent->failure_count = 0;
relerrent->last_failure = 0;
relerrent->errmsg[0] = '\0';
- relerrent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ relerrent->failure_count = 0;
+ relerrent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 4f9c4e9014..e979c4e98f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -277,6 +278,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID if we're skipping all data modification
+ * changes of the transaction specified by MySubscription->skipxid. Note
+ * that we don't skip receiving the changes, particularly in streaming
+ * cases, since we decide whether or not to skip applying them only when
+ * starting to apply the transaction. Once we have started skipping
+ * changes, we copy the XID to skipping_xid and don't stop skipping until
+ * the whole transaction has been skipped, even if the subscription is
+ * invalidated and MySubscription->skipxid gets changed or reset. When we
+ * stop skipping, we reset the skip XID (subskipxid) in pg_subscription
+ * and associate the origin state with the transaction that resets the
+ * skip XID so that we can start streaming from the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -360,6 +376,9 @@ static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static void reset_apply_error_context_rel(void);
static void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -857,6 +876,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -881,7 +905,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * If we have been skipping changes of this transaction, stop doing so.
+ * Otherwise, commit the changes that have just been applied.
+ */
+ if (skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -910,6 +945,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -934,47 +972,57 @@ apply_handle_prepare(StringInfo s)
LSN_FORMAT_ARGS(remote_final_lsn))));
/*
- * Compute unique GID for two_phase transactions. We don't use GID of
- * prepared transaction sent by server as that can lead to deadlock when
- * we have multiple subscriptions from same node point to publications on
- * the same node. See comments atop worker.c
+ * Prepare the transaction unless we have been skipping its changes.
*/
- TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
- gid, sizeof(gid));
+ if (skipping_changes())
+ stop_skipping_changes(prepare_data.end_lsn, prepare_data.prepare_time);
+ else
+ {
+ /*
+ * Compute unique GID for two_phase transactions. We don't use GID of
+ * prepared transaction sent by server as that can lead to deadlock
+ * when we have multiple subscriptions from same node point to
+ * publications on the same node. See comments atop worker.c
+ */
+ TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
+ gid, sizeof(gid));
- /*
- * Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
- *
- * XXX, We can optimize such that at commit prepared time, we first check
- * whether we have prepared the transaction or not but that doesn't seem
- * worthwhile because such cases shouldn't be common.
- */
- begin_replication_step();
+ /*
+ * Unlike commit, here, we always prepare the transaction even though
+ * no change has happened in this transaction. It is done this way
+ * because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
+ *
+ * XXX, We can optimize such that at commit prepared time, we first
+ * check whether we have prepared the transaction or not but that
+ * doesn't seem worthwhile because such cases shouldn't be common.
+ */
+ begin_replication_step();
- /*
- * BeginTransactionBlock is necessary to balance the EndTransactionBlock
- * called within the PrepareTransactionBlock below.
- */
- BeginTransactionBlock();
- CommitTransactionCommand(); /* Completes the preceding Begin command. */
+ /*
+ * BeginTransactionBlock is necessary to balance the
+ * EndTransactionBlock called within the PrepareTransactionBlock
+ * below.
+ */
+ BeginTransactionBlock();
+ CommitTransactionCommand(); /* Completes the preceding Begin command. */
- /*
- * Update origin state so we can restart streaming from correct position
- * in case of crash.
- */
- replorigin_session_origin_lsn = prepare_data.end_lsn;
- replorigin_session_origin_timestamp = prepare_data.prepare_time;
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = prepare_data.end_lsn;
+ replorigin_session_origin_timestamp = prepare_data.prepare_time;
- PrepareTransactionBlock(gid);
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ PrepareTransactionBlock(gid);
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
- store_flush_position(prepare_data.end_lsn);
+ }
+ store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
/* Process any tables that are being synchronized in parallel. */
@@ -1087,9 +1135,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1111,6 +1160,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1123,9 +1175,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1207,6 +1256,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1299,6 +1349,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping changes if we have been doing so */
+ if (skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1309,11 +1363,11 @@ static void
apply_handle_stream_commit(StringInfo s)
{
TransactionId xid;
+ LogicalRepCommitData commit_data;
StringInfoData s2;
int nchanges;
char path[MAXPGPATH];
char *buffer = NULL;
- LogicalRepCommitData commit_data;
StreamXidHash *ent;
MemoryContext oldcxt;
BufFile *fd;
@@ -1327,8 +1381,13 @@ apply_handle_stream_commit(StringInfo s)
apply_error_callback_arg.remote_xid = xid;
apply_error_callback_arg.committs = commit_data.committime;
+ remote_final_lsn = commit_data.commit_lsn;
+
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/* Make sure we have an open transaction */
begin_replication_step();
@@ -1360,13 +1419,12 @@ apply_handle_stream_commit(StringInfo s)
MemoryContextSwitchTo(oldcxt);
- remote_final_lsn = commit_data.commit_lsn;
-
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
*/
in_remote_transaction = true;
+
pgstat_report_activity(STATE_RUNNING, NULL);
end_replication_step();
@@ -1439,7 +1497,17 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
nchanges, path);
- apply_handle_commit_internal(&commit_data);
+ if (skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -1448,7 +1516,6 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
-
reset_apply_error_context_info();
}
@@ -2328,6 +2395,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
ErrorContextCallback errcallback;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Push apply error context callback. Other fields will be filled during
* applying the change.
@@ -3788,3 +3866,106 @@ reset_logicalrep_error_context_rel(void)
apply_error_callback_arg.relname = NULL;
}
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by the skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ (errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid)));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID.
+ *
+ * If origin_lsn and origin_committs are valid, we associate the origin
+ * state with the transaction that resets the skip XID so that we can
+ * start streaming from the transaction following the one we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(skipping_changes());
+ Assert(TransactionIdIsValid(skipping_xid));
+ Assert(in_remote_transaction);
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know that
+ * the subscription is no longer stuck on the conflict.
+ *
+ * The message that clears the error statistics can be lost, but that's
+ * okay. Users can confirm that logical replication is working fine in
+ * other ways, for example by checking the pg_stat_subscription view, and
+ * can reset the error statistics of a single subscription with the
+ * pg_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e28248af32..af5c16abfa 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,7 +3659,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3676,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 1104886bef..4a1185a4f6 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -563,9 +563,10 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The clear messages use below field */
+ /* The reset and clear messages use the fields below */
bool m_reset; /* clear all fields and set reset_stats
* timestamp */
+ bool m_clear; /* clear all fields except for the failure count */
/* The error report message uses below fields */
Oid m_databaseid;
@@ -1101,6 +1102,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 67f92b3878..e2ec685f78 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -286,6 +286,28 @@ ERROR: unrecognized subscription parameter: "two_phase"
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ERROR: cannot set streaming = true for two-phase enabled subscription
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid, synchronous_commit, binary, streaming);
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 88743ab33b..2412b28422 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -220,6 +220,25 @@ ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid, synchronous_commit, binary, streaming);
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
DROP SUBSCRIPTION regress_testsub;
diff --git a/src/test/subscription/t/023_skip_xact.pl b/src/test/subscription/t/023_skip_xact.pl
new file mode 100644
index 0000000000..7b29828cce
--- /dev/null
+++ b/src/test/subscription/t/023_skip_xact.pl
@@ -0,0 +1,185 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Test skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 8;
+
+sub test_subscription_error
+{
+ my ($node, $expected, $source, $relname, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = get_new_node('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = get_new_node('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# Don't overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ 'wal_retrieve_retry_interval = 5s');
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Start logical replication. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to a unique constraint violation.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on);");
+
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Also wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violating
+# the unique constraint on test_tab1.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab1 VALUES (1)");
+
+# Insert enough rows to test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber for the same reason.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);");
+
+# Check that all the errors are reported: two on the tap_sub subscription and
+# one on tap_sub_streaming.
+test_subscription_error($node_subscriber, qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'apply', 'test_tab1', 'error reporting by the apply worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'tablesync', 'test_tab2', 'error reporting by the table sync worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'apply', 'test_tab_streaming', 'error reporting by the apply worker');
+
+# Set the XIDs of the transactions in question on the subscriptions so that
+# they are skipped.
+my $skip_xid1 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab1'::regclass");
+my $skip_xid2 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab_streaming'::regclass");
+
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (skip_xid = $skip_xid1)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_streaming SET (skip_xid = $skip_xid2)");
+
+# Restart the subscriber to restart logical replication without waiting for
+# the retry interval.
+$node_subscriber->restart;
+
+# Wait until the transactions in question are skipped.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription
+WHERE subname in ('tap_sub', 'tap_sub_streaming') AND subskipxid IS NULL
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+# Insert data to test_tab1 that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+
+# Also, insert data to test_tab_streaming that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transaction.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped transaction");
+
+# Check that the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
From 84752e871e0c46bbec1b02cdf87dfe092f67dfc2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v4 2/3] Add pg_stat_logical_replication_error statistics view.
This commit adds a new system view pg_stat_subscription_errors,
showing errors that happen while applying logical replication changes
as well as during initial table synchronization.
It also adds the SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
---
doc/src/sgml/monitoring.sgml | 151 +++++
src/backend/catalog/pg_subscription.c | 23 +-
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/commands/subscriptioncmds.c | 30 +-
src/backend/postmaster/pgstat.c | 639 ++++++++++++++++++++
src/backend/replication/logical/tablesync.c | 16 +-
src/backend/replication/logical/worker.c | 48 +-
src/backend/utils/adt/pgstatfuncs.c | 120 ++++
src/backend/utils/error/elog.c | 16 +
src/include/catalog/pg_proc.dat | 13 +
src/include/catalog/pg_subscription_rel.h | 3 +-
src/include/pgstat.h | 112 ++++
src/include/utils/elog.h | 1 +
src/test/regress/expected/rules.out | 22 +
15 files changed, 1192 insertions(+), 31 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..ca9eec5e22 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that has happened on a subscription, showing
+ information about subscription errors.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,126 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is processing when the
+ error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error happened. This
+ field is always NULL if the error is reported by the
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of publisher node being applied when the error
+ happened. This field is always NULL if the error is reported
+ by <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times the error has happened on the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the last failure time.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5301,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..d76bdff36a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -541,18 +541,27 @@ GetSubscriptionRelations(Oid subid)
/*
* Get all relations for subscription that are not in a ready state.
*
- * Returned list is palloc'ed in current memory context.
+ * Returned HTAB is created in current memory context.
*/
-List *
+HTAB *
GetSubscriptionNotReadyRelations(Oid subid)
{
- List *res = NIL;
+ HTAB *htab;
+ HASHCTL hash_ctl;
Relation rel;
HeapTuple tup;
int nkeys = 0;
ScanKeyData skey[2];
SysScanDesc scan;
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ hash_ctl.hcxt = CurrentMemoryContext;
+ htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
rel = table_open(SubscriptionRelRelationId, AccessShareLock);
ScanKeyInit(&skey[nkeys++],
@@ -577,8 +586,8 @@ GetSubscriptionNotReadyRelations(Oid subid)
subrel = (Form_pg_subscription_rel) GETSTRUCT(tup);
- relstate = (SubscriptionRelState *) palloc(sizeof(SubscriptionRelState));
- relstate->relid = subrel->srrelid;
+ relstate = (SubscriptionRelState *) hash_search(htab, (void *) &subrel->srrelid,
+ HASH_ENTER, NULL);
relstate->state = subrel->srsubstate;
d = SysCacheGetAttr(SUBSCRIPTIONRELMAP, tup,
Anum_pg_subscription_rel_srsublsn, &isnull);
@@ -586,13 +595,11 @@ GetSubscriptionNotReadyRelations(Oid subid)
relstate->lsn = InvalidXLogRecPtr;
else
relstate->lsn = DatumGetLSN(d);
-
- res = lappend(res, relstate);
}
/* Cleanup */
systable_endscan(scan);
table_close(rel, AccessShareLock);
- return res;
+ return htab;
}
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..cd07f2e02f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 22ae982328..da02d3bbfa 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -86,7 +86,7 @@ typedef struct SubOpts
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
-static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
+static void ReportSlotConnectionError(HTAB *rstates, Oid subid, char *slotname, char *err);
/*
@@ -1163,7 +1163,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
char *err = NULL;
WalReceiverConn *wrconn;
Form_pg_subscription form;
- List *rstates;
+ HTAB *rstates;
+ HASH_SEQ_STATUS hstat;
+ SubscriptionRelState *rstate;
/*
* Lock pg_subscription with AccessExclusiveLock to ensure that the
@@ -1286,9 +1288,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* exclusive lock on the subscription.
*/
rstates = GetSubscriptionNotReadyRelations(subid);
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
Oid relid = rstate->relid;
/* Only cleanup resources of tablesync workers */
@@ -1321,8 +1323,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* If there is no slot associated with the subscription, we can finish
* here.
*/
- if (!slotname && rstates == NIL)
+ if (!slotname && hash_get_num_entries(rstates) == 0)
{
+ hash_destroy(rstates);
table_close(rel, NoLock);
return;
}
@@ -1346,7 +1349,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
if (!slotname)
{
/* be tidy */
- list_free(rstates);
+ hash_destroy(rstates);
table_close(rel, NoLock);
return;
}
@@ -1358,9 +1361,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
PG_TRY();
{
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
Oid relid = rstate->relid;
/* Only cleanup resources of tablesync workers */
@@ -1389,7 +1392,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
}
- list_free(rstates);
+ hash_destroy(rstates);
/*
* If there is a slot associated with the subscription, then drop the
@@ -1641,13 +1644,14 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
* them manually, if required.
*/
static void
-ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err)
+ReportSlotConnectionError(HTAB *rstates, Oid subid, char *slotname, char *err)
{
- ListCell *lc;
+ HASH_SEQ_STATUS hstat;
+ SubscriptionRelState *rstate;
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
Oid relid = rstate->relid;
/* Only cleanup resources of tablesync workers */
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 11702f2a80..4a35e6640f 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/partition.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE 32
/* ----------
@@ -279,6 +282,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionErrHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -320,6 +324,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ bool create);
+static PgStat_StatSubRelErrEntry *pgstat_get_subscription_rel_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -358,6 +368,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1134,6 +1148,133 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionErrHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubErrEntry *suberrent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionErrHash);
+ while ((suberrent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASH_SEQ_STATUS hstat_rel;
+ HTAB *rstates;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(suberrent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = suberrent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (suberrent->suberrors == NULL)
+ continue;
+
+ /*
+ * The subscription has error entries. We search errors of the
+ * table sync workers who are already in sync state. Those errors
+ * should be removed.
+ *
+ * Note that the lifetimes of the error entries of the apply worker
+ * and the table sync worker differ: the former lives until the
+ * subscription is dropped whereas the latter lives until the table
+ * synchronization is completed.
+ */
+ rstates = GetSubscriptionNotReadyRelations(suberrent->subid);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = suberrent->subid;
+
+ hash_seq_init(&hstat_rel, suberrent->suberrors);
+ while ((relerrent = (PgStat_StatSubRelErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(relerrent->subrelid))
+ continue;
+
+ /*
+ * Add the relid to the message if the table synchronization
+ * for this relation has already completed or the table is no
+ * longer subscribed.
+ */
+ if (hash_search(rstates, (void *) &(relerrent->subrelid), HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = relerrent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(rstates);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1543,6 +1684,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the error entry of a subscription.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1863,6 +2023,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about an error on a subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_reset = false;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_last_failure = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2895,6 +3086,38 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription errors struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, false);
+}
+
+/*
+ * ---------
+ * pgstat_fetch_subscription_rel_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubRelErrEntry *
+pgstat_fetch_subscription_rel_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_rel_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3424,6 +3647,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3725,6 +3961,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionErrHash)
+ {
+ PgStat_StatSubErrEntry *suberrent;
+
+ hash_seq_init(&hstat, subscriptionErrHash);
+ while ((suberrent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASH_SEQ_STATUS relhstat;
+ long nrels;
+
+ /* Skip this subscription if it doesn't have any errors */
+ if (suberrent->suberrors == NULL)
+ continue;
+
+ nrels = hash_get_num_entries(suberrent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(suberrent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nrels, sizeof(long), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, suberrent->suberrors);
+ while ((relerrent = (PgStat_StatSubRelErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubRelErrEntry entry that
+ * contains the fixed-legnth error message string which is
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN in length, making the stats
+ * file bloat. We don't expect we have many error entries but
+ * if the expectation became false we should write the string
+ * and its length instead.
+ */
+ rc = fwrite(relerrent, sizeof(PgStat_StatSubRelErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4184,6 +4464,99 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubErrEntry struct followed by the number
+ * of errors and PgStat_StatSubRelErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubErrEntry suberrbuf;
+ PgStat_StatSubErrEntry *suberrent;
+ long nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&suberrbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionErrHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionErrHash = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription error entry */
+ suberrent =
+ (PgStat_StatSubErrEntry *) hash_search(subscriptionErrHash,
+ (void *) &(suberrbuf.subid),
+ HASH_ENTER, NULL);
+ suberrent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(long), fpin) != sizeof(long))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubRelErrEntry *subrelent;
+ PgStat_StatSubRelErrEntry subrelbuf;
+
+ if (fread(&subrelbuf, 1, sizeof(PgStat_StatSubRelErrEntry), fpin) !=
+ sizeof(PgStat_StatSubRelErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (suberrent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubRelErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ suberrent->suberrors = hash_create("Subscription relation error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ subrelent =
+ (PgStat_StatSubRelErrEntry *) hash_search(suberrent->suberrors,
+ (void *) &(subrelbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(subrelent, &subrelbuf, sizeof(PgStat_StatSubRelErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4526,6 +4899,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubErrEntry struct followed by the number
+ * of errors and PgStat_StatSubRelErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubErrEntry mySubErrs;
+ PgStat_StatSubRelErrEntry subrelbuf;
+ long nrels;
+
+ if (fread(&mySubErrs, 1, sizeof(PgStat_StatSubErrEntry), fpin)
+ != sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nrels, 1, sizeof(long), fpin) != sizeof(long))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nrels; i++)
+ {
+ if (fread(&subrelbuf, 1, sizeof(PgStat_StatSubRelErrEntry), fpin) !=
+ sizeof(PgStat_StatSubRelErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4716,6 +5133,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionErrHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5650,6 +6068,122 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubRelErrEntry *relerrent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ relerrent = pgstat_get_subscription_rel_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ if (relerrent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ relerrent->relid = InvalidOid;
+ relerrent->command = 0;
+ relerrent->xid = InvalidTransactionId;
+ relerrent->failure_count = 0;
+ relerrent->last_failure = 0;
+ relerrent->errmsg[0] = '\0';
+ relerrent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ Assert(relerrent);
+
+ /* update the error entry */
+ relerrent->databaseid = msg->m_databaseid;
+ relerrent->relid = msg->m_relid;
+ relerrent->command = msg->m_command;
+ relerrent->xid = msg->m_xid;
+ relerrent->failure_count++;
+ relerrent->last_failure = msg->m_last_failure;
+ strlcpy(relerrent->errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionErrHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubErrEntry *suberrent;
+
+ suberrent = hash_search(subscriptionErrHash, (void *) &(msg->m_subids[i]),
+ HASH_FIND, NULL);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (suberrent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (suberrent->suberrors != NULL)
+ hash_destroy(suberrent->suberrors);
+
+ (void) hash_search(subscriptionErrHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ if (subscriptionErrHash == NULL)
+ return;
+
+ PgStat_StatSubErrEntry *suberrent;
+
+ /* Look up the subscription entry once; all relids belong to it */
+ suberrent = hash_search(subscriptionErrHash, (void *) &(msg->m_subid),
+ HASH_FIND, NULL);
+
+ /*
+ * Nothing to do if the subscription entry is not found or has no
+ * per-relation errors. This could happen when the subscription with
+ * msg->m_subid is removed and the corresponding entry is also removed
+ * before receiving the error purge message.
+ */
+ if (suberrent == NULL || suberrent->suberrors == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ (void) hash_search(suberrent->suberrors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5747,6 +6281,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription OID.
+ * Return NULL if not found and the caller didn't request to create it.
+ *
+ * create tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, bool create)
+{
+ PgStat_StatSubErrEntry *suberrent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionErrHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subscriptionErrHash = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ suberrent = (PgStat_StatSubErrEntry *) hash_search(subscriptionErrHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ suberrent->suberrors = NULL;
+
+ return suberrent;
+}
+
+/* ----------
+ * pgstat_get_subscription_rel_error_entry
+ *
+ * Return the subscription relation error entry for the given subscription
+ * OID and relation OID. Return NULL if not found and the caller didn't
+ * request to create it.
+ *
+ * create tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubRelErrEntry *
+pgstat_get_subscription_rel_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubErrEntry *suberrent;
+ PgStat_StatSubRelErrEntry *relerrent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ suberrent = pgstat_get_subscription_error_entry(subid, create);
+
+ if (suberrent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (suberrent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubRelErrEntry);
+ suberrent->suberrors = hash_create("Subscription relation error hash",
+ PGSTAT_SUBSCRIPTION_ERR_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ relerrent = (PgStat_StatSubRelErrEntry *) hash_search(suberrent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ relerrent->databaseid = InvalidOid;
+ relerrent->relid = InvalidOid;
+ relerrent->command = 0;
+ relerrent->xid = InvalidTransactionId;
+ relerrent->failure_count = 0;
+ relerrent->last_failure = 0;
+ relerrent->errmsg[0] = '\0';
+ relerrent->stat_reset_timestamp = 0;
+ }
+
+ return relerrent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a43c..8765396432 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1175,8 +1175,8 @@ FetchTableStates(bool *started_tx)
if (!table_states_valid)
{
MemoryContext oldctx;
- List *rstates;
- ListCell *lc;
+ HTAB *rstates;
+ HASH_SEQ_STATUS hstat;
SubscriptionRelState *rstate;
/* Clean the old lists. */
@@ -1194,14 +1194,18 @@ FetchTableStates(bool *started_tx)
/* Allocate the tracking info in a permanent memory context. */
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- foreach(lc, rstates)
+ hash_seq_init(&hstat, rstates);
+ while ((rstate = (SubscriptionRelState *) hash_seq_search(&hstat)) != NULL)
{
- rstate = palloc(sizeof(SubscriptionRelState));
- memcpy(rstate, lfirst(lc), sizeof(SubscriptionRelState));
- table_states_not_ready = lappend(table_states_not_ready, rstate);
+ SubscriptionRelState *r = palloc(sizeof(SubscriptionRelState));
+
+ memcpy(r, rstate, sizeof(SubscriptionRelState));
+ table_states_not_ready = lappend(table_states_not_ready, r);
}
MemoryContextSwitchTo(oldctx);
+ hash_destroy(rstates);
+
/*
* Does the subscription have tables?
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index d346377b20..4f9c4e9014 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -227,6 +227,7 @@ typedef struct ApplyErrCallbackArg
LogicalRepMsgType command; /* 0 if invalid */
/* Local relation information */
+ Oid relid; /* used for error reporting */
char *nspname; /* used for error context */
char *relname; /* used for error context */
@@ -236,6 +237,7 @@ typedef struct ApplyErrCallbackArg
static ApplyErrCallbackArg apply_error_callback_arg =
{
.command = 0,
+ .relid = InvalidOid,
.relname = NULL,
.nspname = NULL,
.remote_xid = InvalidTransactionId,
@@ -3522,8 +3524,26 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ geterrmessage());
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3641,7 +3661,24 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.relid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ geterrmessage());
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
@@ -3686,6 +3723,7 @@ apply_error_callback(void *arg)
static void
set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
{
+ apply_error_callback_arg.relid = rel->localreloid;
apply_error_callback_arg.nspname = rel->remoterel.nspname;
apply_error_callback_arg.relname = rel->remoterel.relname;
}
@@ -3694,6 +3732,7 @@ set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
static void
reset_apply_error_context_rel(void)
{
+ apply_error_callback_arg.relid = InvalidOid;
apply_error_callback_arg.nspname = NULL;
apply_error_callback_arg.relname = NULL;
}
@@ -3724,6 +3763,7 @@ set_logicalrep_error_context_rel(Relation rel)
{
if (IsLogicalWorker())
{
+ apply_error_callback_arg.relid = RelationGetRelid(rel);
apply_error_callback_arg.nspname =
get_namespace_name(RelationGetNamespace(rel));
apply_error_callback_arg.relname =
@@ -3737,6 +3777,8 @@ reset_logicalrep_error_context_rel(void)
{
if (IsLogicalWorker())
{
+ apply_error_callback_arg.relid = InvalidOid;
+
if (apply_error_callback_arg.nspname)
pfree(apply_error_callback_arg.nspname);
apply_error_callback_arg.nspname = NULL;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f0e09eae4d..f1348a415e 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,8 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
+#include "replication/logicalworker.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2240,6 +2242,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset the error stats of a subscription */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2380,3 +2399,104 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubRelErrEntry *relerrent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ relerrent = pgstat_fetch_subscription_rel_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (relerrent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(relerrent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(relerrent->relid))
+ values[2] = ObjectIdGetDatum(relerrent->relid);
+ else
+ nulls[2] = true;
+
+ if (relerrent->command == 0)
+ nulls[3] = true;
+ else
+ {
+ /* should be apply worker */
+ Assert(!OidIsValid(relerrent->subrelid));
+
+ values[3] = CStringGetTextDatum(logicalrep_message_type(relerrent->command));
+ }
+
+ if (TransactionIdIsValid(relerrent->xid))
+ values[4] = TransactionIdGetDatum(relerrent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(relerrent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ values[6] = Int64GetDatum(relerrent->failure_count);
+
+ if (relerrent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(relerrent->last_failure);
+
+ values[8] = CStringGetTextDatum(relerrent->errmsg);
+
+ if (relerrent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(relerrent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..dd36850016 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,6 +1441,22 @@ getinternalerrposition(void)
return edata->internalpos;
}
+/*
+ * geterrmessage --- return the currently set error message
+ *
+ * This is only meaningful while an error is being processed, e.g., in an
+ * error callback or a PG_CATCH block; elsewhere there is no current error.
+ */
+const char *
+geterrmessage(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return (const char *) edata->message;
+}
/*
* Functions to allow construction of error message strings separately from
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 8cd0252082..044ff52227 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5321,6 +5321,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5708,6 +5716,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index 632381b4e3..50053cdafc 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -22,6 +22,7 @@
#include "catalog/genbki.h"
#include "catalog/pg_subscription_rel_d.h"
#include "nodes/pg_list.h"
+#include "utils/hsearch.h"
/* ----------------
* pg_subscription_rel definition. cpp turns this into
@@ -89,6 +90,6 @@ extern void RemoveSubscriptionRel(Oid subid, Oid relid);
extern bool HasSubscriptionRelations(Oid subid);
extern List *GetSubscriptionRelations(Oid subid);
-extern List *GetSubscriptionNotReadyRelations(Oid subid);
+extern HTAB *GetSubscriptionNotReadyRelations(Oid subid);
#endif /* PG_SUBSCRIPTION_REL_H */
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9612c0a6c2..1104886bef 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -539,6 +543,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or a table sync worker
+ * to report or reset an error that happened during
+ * logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* The clear messages use below field */
+ bool m_reset; /* clear all fields and set reset_stats
+ * timestamp */
+
+ /* The error report message uses below fields */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_last_failure;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by autovacuum to purge the
+ * subscription errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by autovacuum to purge the subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -710,6 +776,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -908,6 +977,42 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription error statistics kept in the stats collector.
+ *
+ * PgStat_StatSubErrEntry holds all errors associated with the subscription,
+ * reported by the apply worker and the table sync workers. This entry is
+ * created when the first error message for the subscription is reported
+ * and is dropped along with its errors when the subscription is dropped.
+ *
+ * PgStat_StatSubRelErrEntry represents an error that happened during logical
+ * replication, reported by the apply worker (subrelid is InvalidOid) or by a
+ * table sync worker (subrelid is a valid OID). The error reported by the apply
+ * worker is dropped when the subscription is dropped, whereas the error
+ * reported by a table sync worker is dropped when the table synchronization
+ * has completed.
+ */
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubErrEntry;
+
+typedef struct PgStat_StatSubRelErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubRelErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -995,6 +1100,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1011,6 +1117,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1106,6 +1215,9 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid);
+extern PgStat_StatSubRelErrEntry *pgstat_fetch_subscription_rel_error(Oid subid,
+ Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f53607e12e..155145a77d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -222,6 +222,7 @@ extern int err_generic_string(int field, const char *str);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
+extern const char *geterrmessage(void);
/*----------
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..ffad9790ae 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
--
2.24.3 (Apple Git-128)
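As an aside, the collector-side bookkeeping in the patch above can be sketched in miniature. The following is a hypothetical Python model (plain dicts standing in for the dynahash tables) of the lookup-or-create behavior of pgstat_get_subscription_rel_error_entry(): the outer hash is keyed by subscription OID, the nested hash by subrelid, and a lookup with create=false returns nothing when the entry doesn't exist. Field names mirror the patch; everything else is illustrative.

```python
def new_rel_err_entry(subrelid):
    # Matches the field initialization done in the patch when create && !found.
    return {"subrelid": subrelid, "relid": None, "command": 0,
            "xid": None, "failure_count": 0, "last_failure": 0,
            "errmsg": "", "stat_reset_timestamp": 0}

def get_rel_error_entry(sub_hash, subid, subrelid, create):
    """Two-level lookup: subscription OID -> subrelid -> error entry.

    With create=False, return None if either level is missing; with
    create=True, build the missing levels and initialize the new entry.
    """
    suberrent = sub_hash.get(subid)
    if suberrent is None:
        if not create:
            return None
        suberrent = sub_hash[subid] = {"suberrors": {}}
    rel_hash = suberrent["suberrors"]
    relerrent = rel_hash.get(subrelid)
    if relerrent is None:
        if not create:
            return None
        relerrent = rel_hash[subrelid] = new_rel_err_entry(subrelid)
    return relerrent
```

The reporting path would call this with create=True (so the first error for a subscription builds both hash levels), while the fetch path used by pg_stat_get_subscription_error() would pass create=False and simply get None back for a subscription with no recorded errors.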
Attachment: 0001-Remove-unused-function-argument-in-apply_handle_comm.patch (application/octet-stream)
From 780a4f0cf0e2ed8a84710c0a564ec3d8eaeec3ee Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 26 Jul 2021 11:34:36 +0900
Subject: [PATCH 1/4] Remove unused function argument in
apply_handle_commit_internal()
---
src/backend/replication/logical/worker.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index b9a7a7ffbb..186be1a188 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -309,8 +309,7 @@ static void maybe_reread_subscription(void);
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
-static void apply_handle_commit_internal(StringInfo s,
- LogicalRepCommitData *commit_data);
+static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
TupleTableSlot *remoteslot);
@@ -853,7 +852,7 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(s, &commit_data);
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -1390,7 +1389,7 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
nchanges, path);
- apply_handle_commit_internal(s, &commit_data);
+ apply_handle_commit_internal(&commit_data);
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -1405,7 +1404,7 @@ apply_handle_stream_commit(StringInfo s)
* Helper function for apply_handle_commit and apply_handle_stream_commit.
*/
static void
-apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
+apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
if (IsTransactionState())
{
--
2.24.3 (Apple Git-128)
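Similarly, the reset-versus-report handling in pgstat_recv_subscription_error() above (including the guard that a reset for a never-reported entry is a no-op) can be modeled the same way. This is an illustrative Python sketch, not PostgreSQL code; the message fields mirror PgStat_MsgSubscriptionErr.

```python
PGSTAT_SUBSCRIPTIONERR_MSGLEN = 256  # mirrors the patch's constant

def recv_subscription_error(relerrent, msg, now):
    """Model of the collector's SUBSCRIPTIONERR handling: either reset an
    existing entry's fields or record one more failure."""
    if msg["reset"]:
        # A reset for an entry that was never created is a no-op.
        if relerrent is None:
            return
        relerrent.update(relid=None, command=0, xid=None,
                         failure_count=0, last_failure=0, errmsg="",
                         stat_reset_timestamp=now)
    else:
        relerrent["relid"] = msg["relid"]
        relerrent["command"] = msg["command"]
        relerrent["xid"] = msg["xid"]
        relerrent["failure_count"] += 1
        relerrent["last_failure"] = msg["last_failure"]
        # strlcpy()-style truncation, leaving room for the terminating NUL
        relerrent["errmsg"] = msg["errmsg"][:PGSTAT_SUBSCRIPTIONERR_MSGLEN - 1]
```

Note that failure_count survives repeated reports and is only zeroed by an explicit reset, which also stamps stat_reset_timestamp; that matches what pg_stat_reset_subscription_error() exposes to users.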
Attachment: v4-0001-Add-errcontext-to-errors-of-the-applying-logical-.patch (application/octet-stream)
From 3f9f615a5c8e60725d61463a18f00b6568ab08f0 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v4 1/3] Add errcontext to errors of the applying logical
replication changes.
This commit adds an error context to errors that happen while applying
logical replication changes, showing the command, the relation name,
the transaction ID, and the commit timestamp in the server log.
---
src/backend/commands/tablecmds.c | 7 +
src/backend/replication/logical/proto.c | 49 +++++
src/backend/replication/logical/worker.c | 220 ++++++++++++++++++++---
src/include/replication/logicalproto.h | 1 +
src/include/replication/logicalworker.h | 2 +
5 files changed, 257 insertions(+), 22 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fcd778c62a..911bef8312 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -78,6 +78,7 @@
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "pgstat.h"
+#include "replication/logicalworker.h"
#include "rewrite/rewriteDefine.h"
#include "rewrite/rewriteHandler.h"
#include "rewrite/rewriteManip.h"
@@ -1899,6 +1900,9 @@ ExecuteTruncateGuts(List *explicit_rels,
continue;
}
+ /* Set logical replication error callback info if necessary */
+ set_logicalrep_error_context_rel(rel);
+
/*
* Build the lists of foreign tables belonging to each foreign server
* and pass each list to the foreign data wrapper's callback function,
@@ -2006,6 +2010,9 @@ ExecuteTruncateGuts(List *explicit_rels,
pgstat_count_truncate(rel);
}
+ /* Reset logical replication error callback info */
+ reset_logicalrep_error_context_rel();
+
/* Now go through the hash table, and truncate foreign tables */
if (ft_htab)
{
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index a245252529..54fff7df21 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1109,3 +1109,52 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * get string representing LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 186be1a188..d346377b20 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -221,6 +221,27 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname; /* used for error context */
+ char *relname; /* used for error context */
+
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .relname = NULL,
+ .nspname = NULL,
+ .remote_xid = InvalidTransactionId,
+ .committs = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -332,6 +353,10 @@ static void apply_handle_tuple_routing(ApplyExecutionData *edata,
/* Compute GID for two_phase transactions */
static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int szgid);
+static void apply_error_callback(void *arg);
+static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
+static void reset_apply_error_context_rel(void);
+static void reset_apply_error_context_info(void);
/*
* Should this worker apply changes for given relation.
@@ -825,6 +850,8 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.committime;
remote_final_lsn = begin_data.final_lsn;
@@ -858,6 +885,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -875,6 +903,8 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.prepare_time;
remote_final_lsn = begin_data.prepare_lsn;
@@ -949,6 +979,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -961,6 +992,8 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ apply_error_callback_arg.remote_xid = prepare_data.xid;
+ apply_error_callback_arg.committs = prepare_data.commit_time;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -988,6 +1021,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1000,6 +1034,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ apply_error_callback_arg.remote_xid = rollback_data.xid;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1037,6 +1072,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1093,6 +1129,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ apply_error_callback_arg.remote_xid = stream_xid;
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1149,6 +1187,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1172,7 +1211,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ apply_error_callback_arg.remote_xid = xid;
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1197,6 +1239,7 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ apply_error_callback_arg.remote_xid = subxid;
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1221,6 +1264,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1252,6 +1296,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1276,6 +1322,8 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.committs = commit_data.committime;
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1398,6 +1446,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1517,6 +1567,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1540,6 +1593,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1638,6 +1694,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1691,6 +1750,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1794,6 +1856,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1819,6 +1884,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2223,6 +2291,9 @@ apply_handle_truncate(StringInfo s)
* Even if we used CASCADE on the upstream primary we explicitly default
* to replaying changes without further cascading. This might be later
* changeable with a user specified option.
+ *
+ * Both namespace and relation name for error callback will be set in
+ * ExecuteTruncateGuts().
*/
ExecuteTruncateGuts(rels,
relids,
@@ -2253,44 +2324,54 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
+
+ /*
+ * Push the apply error context callback. The remaining fields of the
+ * callback argument are filled in while applying the change.
+ */
+ apply_error_callback_arg.command = action;
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2299,45 +2380,48 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_END:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
}
/*
@@ -3570,3 +3654,95 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("during apply of \"%s\""),
+ logicalrep_message_type(apply_error_callback_arg.command));
+
+ if (apply_error_callback_arg.relname)
+ appendStringInfo(&buf, _(" for relation \"%s.%s\""),
+ apply_error_callback_arg.nspname,
+ apply_error_callback_arg.relname);
+
+ if (TransactionIdIsNormal(apply_error_callback_arg.remote_xid))
+ appendStringInfo(&buf, _(" in transaction with xid %u committs %s"),
+ apply_error_callback_arg.remote_xid,
+ apply_error_callback_arg.committs == 0
+ ? "(unset)"
+ : timestamptz_to_str(apply_error_callback_arg.committs));
+
+ errcontext("%s", buf.data);
+}
+
+/* Set relation information of apply error callback */
+static void
+set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
+{
+ apply_error_callback_arg.nspname = rel->remoterel.nspname;
+ apply_error_callback_arg.relname = rel->remoterel.relname;
+}
+
+/* Reset relation information of apply error callback */
+static void
+reset_apply_error_context_rel(void)
+{
+ apply_error_callback_arg.nspname = NULL;
+ apply_error_callback_arg.relname = NULL;
+}
+
+/* Reset all information of apply error callback */
+static void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.remote_xid = InvalidTransactionId;
+ apply_error_callback_arg.committs = 0;
+ reset_apply_error_context_rel();
+}
+
+/*
+ * Set relation information of error callback.
+ *
+ * Both set_logicalrep_error_context_rel() and
+ * reset_logicalrep_error_context_rel() functions are intended to be
+ * used by functions outside of logical replication module where don't
+ * use LogicalRepRelMapEntry.
+ *
+ * The caller must call reset_logicalrep_error_context_rel() after use
+ * so that the memory allocated for the names is freed.
+ */
+void
+set_logicalrep_error_context_rel(Relation rel)
+{
+ if (IsLogicalWorker())
+ {
+ apply_error_callback_arg.nspname =
+ get_namespace_name(RelationGetNamespace(rel));
+ apply_error_callback_arg.relname =
+ pstrdup(RelationGetRelationName(rel));
+ }
+}
+
+/* Reset relation information for error callback set */
+void
+reset_logicalrep_error_context_rel(void)
+{
+ if (IsLogicalWorker())
+ {
+ if (apply_error_callback_arg.nspname)
+ pfree(apply_error_callback_arg.nspname);
+ apply_error_callback_arg.nspname = NULL;
+
+ if (apply_error_callback_arg.relname)
+ pfree(apply_error_callback_arg.relname);
+ apply_error_callback_arg.relname = NULL;
+ }
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 63de90d94a..c78a4409bc 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -242,5 +242,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/include/replication/logicalworker.h b/src/include/replication/logicalworker.h
index 2ad61a001a..d3e8514ffd 100644
--- a/src/include/replication/logicalworker.h
+++ b/src/include/replication/logicalworker.h
@@ -15,5 +15,7 @@
extern void ApplyWorkerMain(Datum main_arg);
extern bool IsLogicalWorker(void);
+extern void set_logicalrep_error_context_rel(Relation rel);
+extern void reset_logicalrep_error_context_rel(void);
#endif /* LOGICALWORKER_H */
--
2.24.3 (Apple Git-128)
On Thu, Jul 29, 2021 at 11:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jul 29, 2021 at 2:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Yeah, it seems to be introduced by commit 0926e96c493. I've attached
the patch for that.

Also, I've attached the updated version patches. This version patch
has pg_stat_reset_subscription_error() SQL function and sends a clear
message after skipping the transaction. 0004 patch includes the
skipping transaction feature and introducing RESET to ALTER
SUBSCRIPTION. It would be better to separate them.
+1, to separate out the reset part.
I've attached the new version patches that fix cfbot failure.
Sorry I've attached wrong ones. Reattached the correct version patches.
Pushed the 0001* patch that removes the unused parameter.
Few comments on v4-0001-Add-errcontext-to-errors-of-the-applying-logical-
===========================================================
1.
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -78,6 +78,7 @@
#include "partitioning/partbounds.h"
#include "partitioning/partdesc.h"
#include "pgstat.h"
+#include "replication/logicalworker.h"
#include "rewrite/rewriteDefine.h"
#include "rewrite/rewriteHandler.h"
#include "rewrite/rewriteManip.h"
@@ -1899,6 +1900,9 @@ ExecuteTruncateGuts(List *explicit_rels,
continue;
}
+ /* Set logical replication error callback info if necessary */
+ set_logicalrep_error_context_rel(rel);
+
/*
* Build the lists of foreign tables belonging to each foreign server
* and pass each list to the foreign data wrapper's callback function,
@@ -2006,6 +2010,9 @@ ExecuteTruncateGuts(List *explicit_rels,
pgstat_count_truncate(rel);
}
+ /* Reset logical replication error callback info */
+ reset_logicalrep_error_context_rel();
+
Setting up logical rep error context in a generic function looks a bit
odd to me. Do we really need to set up error context here? I
understand we can't do this in caller but anyway I think we are not
sending this to logical replication view as well, so not sure we need
to do it here.
2.
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname; /* used for error context */
+ char *relname; /* used for error context */
+
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .relname = NULL,
+ .nspname = NULL,
+ .remote_xid = InvalidTransactionId,
+ .committs = 0,
+};
+
Better to have a space between the above two declarations.
3. commit message:
This commit adds the error context to errors happening during applying
logical replication changes, showing the command, the relation
relation, transaction ID, and commit timestamp in the server log.
'relation' is mentioned twice.
The patch is not getting applied probably due to yesterday's commit in
this area.
--
With Regards,
Amit Kapila.
On July 29, 2021 1:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Sorry I've attached wrong ones. Reattached the correct version patches.
Hi,
I had some comments on the new version patches.
1)
- relstate = (SubscriptionRelState *) palloc(sizeof(SubscriptionRelState));
- relstate->relid = subrel->srrelid;
+ relstate = (SubscriptionRelState *) hash_search(htab, (void *) &subrel->srrelid,
+ HASH_ENTER, NULL);
I found the new version patch changes the List type 'relstate' to hash table type
'relstate'. Will this bring significant performance improvements?
2)
+ * PgStat_StatSubRelErrEntry represents a error happened during logical
a error => an error
3)
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
It seems the 'subid' column is not mentioned in the document of the
pg_stat_subscription_errors view.
4)
+
+ if (fread(&nrels, 1, sizeof(long), fpin) != sizeof(long))
+ {
...
+ for (int i = 0; i < nrels; i++)
the type of i (int) seems different from the type of 'nrels' (long); it might be
better to use the same type.
Best regards,
houzj
On Fri, Jul 30, 2021 at 12:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 29, 2021 at 11:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jul 29, 2021 at 2:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Yeah, it seems to be introduced by commit 0926e96c493. I've attached
the patch for that.

Also, I've attached the updated version patches. This version patch
has pg_stat_reset_subscription_error() SQL function and sends a clear
message after skipping the transaction. 0004 patch includes the
skipping transaction feature and introducing RESET to ALTER
SUBSCRIPTION. It would be better to separate them.

+1, to separate out the reset part.
Okay, I'll do that.
I've attached the new version patches that fix cfbot failure.
Sorry I've attached wrong ones. Reattached the correct version patches.
Pushed the 0001* patch that removes the unused parameter.
Thanks!
Few comments on v4-0001-Add-errcontext-to-errors-of-the-applying-logical-
===========================================================
Thank you for the comments!
1.
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -78,6 +78,7 @@
 #include "partitioning/partbounds.h"
 #include "partitioning/partdesc.h"
 #include "pgstat.h"
+#include "replication/logicalworker.h"
 #include "rewrite/rewriteDefine.h"
 #include "rewrite/rewriteHandler.h"
 #include "rewrite/rewriteManip.h"
@@ -1899,6 +1900,9 @@ ExecuteTruncateGuts(List *explicit_rels,
 continue;
 }

+ /* Set logical replication error callback info if necessary */
+ set_logicalrep_error_context_rel(rel);
+
 /*
 * Build the lists of foreign tables belonging to each foreign server
 * and pass each list to the foreign data wrapper's callback function,
@@ -2006,6 +2010,9 @@ ExecuteTruncateGuts(List *explicit_rels,
 pgstat_count_truncate(rel);
 }

+ /* Reset logical replication error callback info */
+ reset_logicalrep_error_context_rel();
+

Setting up logical rep error context in a generic function looks a bit
odd to me. Do we really need to set up error context here? I
understand we can't do this in caller but anyway I think we are not
sending this to logical replication view as well, so not sure we need
to do it here.
Yeah, I'm not convinced of this part yet. I wanted to show relid also
in truncate cases but I came up with only this idea.
If an error happens during truncating the table (in
ExecuteTruncateGuts()), relid set by
set_logicalrep_error_context_rel() is actually sent to the view. If we
don’t have it, the view always shows relid as NULL in truncate cases.
On the other hand, it doesn’t cover all cases. For example, it doesn’t
cover an error that the target table doesn’t exist on the subscriber,
which happens when opening the target table. Anyway, in most cases,
even if relid is NULL, the error message in the view helps users to
know which relation the error happened on. What do you think?
2.
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname; /* used for error context */
+ char *relname; /* used for error context */
+
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .relname = NULL,
+ .nspname = NULL,
+ .remote_xid = InvalidTransactionId,
+ .committs = 0,
+};
+

Better to have a space between the above two declarations.
Will fix.
3. commit message:
This commit adds the error context to errors happening during applying
logical replication changes, showing the command, the relation
relation, transaction ID, and commit timestamp in the server log.

'relation' is mentioned twice.
Will fix.
The patch is not getting applied probably due to yesterday's commit in
this area.
Okay. I'll rebase the patches to the current HEAD.
I'm incorporating all comments from you and Houzj, and will submit the
new patch soon.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Jul 30, 2021 at 3:47 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On July 29, 2021 1:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Sorry I've attached wrong ones. Reattached the correct version patches.
Hi,
I had some comments on the new version patches.
Thank you for the comments!
1)
- relstate = (SubscriptionRelState *) palloc(sizeof(SubscriptionRelState));
- relstate->relid = subrel->srrelid;
+ relstate = (SubscriptionRelState *) hash_search(htab, (void *) &subrel->srrelid,
+ HASH_ENTER, NULL);

I found the new version patch changes the List type 'relstate' to hash table type
'relstate'. Will this bring significant performance improvements?
For pgstat_vacuum_stat() purposes, I think it's better to use a hash
table to avoid O(N) lookup. But it might not be good to change the
type of the return value of GetSubscriptionNotReadyRelations() since
this returned value is used by other functions to iterate over
elements. Iterating a list is faster than iterating a hash table. It
would be better to change it so that pgstat_vacuum_stat() constructs a
hash table for its own purpose.
2)
+ * PgStat_StatSubRelErrEntry represents a error happened during logical

a error => an error
Will fix.
3)
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,

It seems the 'subid' column is not mentioned in the document of the
pg_stat_subscription_errors view.
Will fix.
4)
+
+ if (fread(&nrels, 1, sizeof(long), fpin) != sizeof(long))
+ {
...
+ for (int i = 0; i < nrels; i++)

the type of i (int) seems different from the type of 'nrels' (long); it might be
better to use the same type.
Will fix.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Aug 2, 2021 at 7:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jul 30, 2021 at 12:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 29, 2021 at 11:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Setting up logical rep error context in a generic function looks a bit
odd to me. Do we really need to set up error context here? I
understand we can't do this in caller but anyway I think we are not
sending this to logical replication view as well, so not sure we need
to do it here.

Yeah, I'm not convinced of this part yet. I wanted to show relid also
in truncate cases but I came up with only this idea.

If an error happens during truncating the table (in
ExecuteTruncateGuts()), relid set by
set_logicalrep_error_context_rel() is actually sent to the view. If we
don’t have it, the view always shows relid as NULL in truncate cases.
On the other hand, it doesn’t cover all cases. For example, it doesn’t
cover an error that the target table doesn’t exist on the subscriber,
which happens when opening the target table. Anyway, in most cases,
even if relid is NULL, the error message in the view helps users to
know which relation the error happened on. What do you think?
Yeah, I also think at this stage the error message is sufficient in such cases.
--
With Regards,
Amit Kapila.
On Mon, Aug 2, 2021 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Aug 2, 2021 at 7:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jul 30, 2021 at 12:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 29, 2021 at 11:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Setting up logical rep error context in a generic function looks a bit
odd to me. Do we really need to set up error context here? I
understand we can't do this in caller but anyway I think we are not
sending this to logical replication view as well, so not sure we need
to do it here.

Yeah, I'm not convinced of this part yet. I wanted to show relid also
in truncate cases but I came up with only this idea.

If an error happens during truncating the table (in
ExecuteTruncateGuts()), relid set by
set_logicalrep_error_context_rel() is actually sent to the view. If we
don’t have it, the view always shows relid as NULL in truncate cases.
On the other hand, it doesn’t cover all cases. For example, it doesn’t
cover an error that the target table doesn’t exist on the subscriber,
which happens when opening the target table. Anyway, in most cases,
even if relid is NULL, the error message in the view helps users to
know which relation the error happened on. What do you think?

Yeah, I also think at this stage the error message is sufficient in such cases.
I've attached new patches that incorporate all comments I got so far.
Please review them.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v5-0004-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch
From c42bc5a6ae93a5838a76e641a611d57764e4d2cd Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v5 4/4] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question.
The user can specify XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating pg_subscription.subskipxid field, telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of
the subscription in the pg_stat_subscription_errors system view so
that the user is not confused. This is done by sending a message for
clearing the subscription error to the stats collector.
---
doc/src/sgml/logical-replication.sgml | 49 +++++-
doc/src/sgml/ref/alter_subscription.sgml | 33 +++-
src/backend/catalog/pg_subscription.c | 10 ++
src/backend/commands/subscriptioncmds.c | 45 ++++-
src/backend/postmaster/pgstat.c | 44 ++++-
src/backend/replication/logical/worker.c | 190 ++++++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/include/pgstat.h | 8 +-
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 ++
src/test/subscription/t/023_skip_xact.pl | 185 ++++++++++++++++++++
12 files changed, 570 insertions(+), 23 deletions(-)
create mode 100644 src/test/subscription/t/023_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..992d8b4ac1 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <link linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in <structname>pg_stat_subscription_errors</structname>
+ view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: during apply of "INSERT" for relation "public.test" in transaction with xid 716 committs 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found in those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> to the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Either way, these should be used as a last resort. They skip the whole
+ transaction, including changes that may not violate any constraint, and can
+ easily make the subscriber inconsistent if the user specifies the wrong
+ transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 626fb8afa2..591f554fc7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -199,11 +199,40 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
This clause sets or resets a subscription option. The parameters that can be
set are the parameters originally set by <xref linkend="sql-createsubscription"/>:
<literal>slot_name</literal>, <literal>synchronous_commit</literal>,
- <literal>binary</literal>, <literal>streaming</literal>.
+ <literal>binary</literal>, <literal>streaming</literal>, and the following
+ parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ stops until the problem is resolved. The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ the incoming change or by skipping the whole transaction. This option
+ specifies the transaction ID whose changes the logical replication
+ worker should skip. The logical replication worker skips all data
+ modification changes within the specified transaction. Since it skips
+ the whole transaction, including changes that may not violate any
+ constraint, it should only be used as a last resort. This option has
+ no effect on a transaction that is already prepared with
+ <literal>two_phase</literal> enabled on the subscriber. After logical
+ replication successfully skips the transaction, the transaction ID
+ (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
<para>
The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ <literal>binary</literal>, <literal>synchronous_commit</literal>,
+ and <literal>skip_xid</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index d0cabedd15..419bbaab6b 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -272,6 +276,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
if (!is_reset)
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -515,6 +542,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -915,7 +943,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, false);
@@ -970,6 +998,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
@@ -977,7 +1012,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, true);
@@ -1003,6 +1038,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
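As a rough illustration of the skip_xid parsing added above (a hypothetical Python model, not PostgreSQL code): the value must parse as a 32-bit unsigned integer and must be a normal XID, i.e. at least FirstNormalTransactionId, which is 3. Note that the server parses the string with xidin (strtoul), so an input such as 1.1 is rejected by the normality check rather than by the parse; either way the user sees the same "invalid transaction id" error.

```python
FIRST_NORMAL_XID = 3   # FirstNormalTransactionId in PostgreSQL
MAX_XID = 2**32 - 1    # TransactionId is a 32-bit unsigned integer

def parse_skip_xid(value: str) -> int:
    """Model of the skip_xid checks: parse, then TransactionIdIsNormal."""
    try:
        xid = int(value, 10)
    except ValueError:
        raise ValueError("invalid transaction id") from None
    if not (FIRST_NORMAL_XID <= xid <= MAX_XID):
        raise ValueError("invalid transaction id")
    return xid
```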
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8b5ff370d3..51386ba708 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1731,11 +1731,32 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
msg.m_subid = subid;
msg.m_subrelid = subrelid;
+ msg.m_clear = false;
msg.m_reset = true;
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
}
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error of the given subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_clear = true;
+ msg.m_reset = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -2078,6 +2099,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_subrelid = subrelid;
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
+ msg.m_clear = false;
msg.m_reset = false;
msg.m_command = command;
msg.m_xid = xid;
@@ -6095,27 +6117,37 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
PgStat_StatSubErrEntry *errent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
errent = pgstat_get_subscription_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
+ Assert(!(msg->m_reset && msg->m_clear));
+
if (errent == NULL)
return;
- /* reset fields and set reset timestamp */
errent->relid = InvalidOid;
errent->command = 0;
errent->xid = InvalidTransactionId;
errent->failure_count = 0;
- errent->last_failure = 0;
- errent->last_errmsg[0] = '\0';
- errent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 5a3ba8d7c1..5b7cb1cea6 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -278,6 +279,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID if we're skipping all data modification changes of
+ * the transaction specified by MySubscription->skipxid. Note that we don't skip
+ * receiving the changes, particularly in streaming cases, since we decide whether
+ * to skip applying them only when we start to apply the transaction. Once we start
+ * skipping changes, we copy the XID to skipping_xid and keep skipping until the
+ * whole transaction has been skipped, even if the subscription is invalidated and
+ * MySubscription->skipxid is changed or reset in the meantime. When we stop
+ * skipping, we clear the skip XID (subskipxid) in pg_subscription and associate
+ * the origin status with the transaction that clears it, so that streaming can
+ * restart from the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -365,6 +381,9 @@ static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static void reset_apply_error_context_rel(void);
static void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -862,6 +881,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -886,7 +910,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * If we are skipping this transaction, stop doing so. Otherwise, commit
+ * the changes that have just been applied.
+ */
+ if (skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -915,6 +950,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -983,7 +1021,10 @@ apply_handle_prepare(StringInfo s)
*/
begin_replication_step();
- apply_handle_prepare_internal(&prepare_data);
+ if (skipping_changes())
+ stop_skipping_changes(prepare_data.end_lsn, prepare_data.prepare_time);
+ else
+ apply_handle_prepare_internal(&prepare_data);
end_replication_step();
CommitTransactionCommand();
@@ -1103,9 +1144,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1127,6 +1169,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1139,9 +1184,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1223,6 +1265,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1315,6 +1358,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop the skipping transaction if enabled */
+ if (skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1465,9 +1512,22 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2353,6 +2413,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
ErrorContextCallback errcallback;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Push apply error context callback. Other fields will be filled during
* applying the change.
@@ -3768,3 +3839,106 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.committs = 0;
reset_apply_error_context_rel();
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by the skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID.
+ *
+ * If origin_lsn and origin_committs are valid, we associate the origin state
+ * with the transaction that resets the skip XID so that we can resume
+ * streaming from the transaction following the one we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(skipping_changes());
+ Assert(TransactionIdIsValid(skipping_xid));
+ Assert(in_remote_transaction);
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription so that users can see
+ * that it is no longer stuck on the conflict.
+ *
+ * The message for clearing the error statistics can be lost, but that's
+ * okay. Users can confirm that logical replication is working in other
+ * ways, for example by checking the pg_stat_subscription view, and they
+ * can reset the error statistics of a single subscription with the
+ * pg_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
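To summarize the control flow added to worker.c, a toy Python model (hypothetical, not PostgreSQL code) of the skip state machine might look like this; handle_begin corresponds to maybe_start_skipping_changes, handle_change to the early return in apply_dispatch, and handle_commit to stop_skipping_changes:

```python
from dataclasses import dataclass, field

INVALID_XID = 0

@dataclass
class ApplyWorker:
    """Toy model of the apply worker's skip state machine."""
    subskipxid: int = INVALID_XID    # pg_subscription.subskipxid (0 = NULL)
    skipping_xid: int = INVALID_XID  # worker-local skipping_xid
    applied: list = field(default_factory=list)

    def skipping_changes(self):
        return self.skipping_xid != INVALID_XID

    def handle_begin(self, xid):
        # maybe_start_skipping_changes(): latch the XID at transaction start
        if self.subskipxid != INVALID_XID and self.subskipxid == xid:
            self.skipping_xid = xid

    def handle_change(self, xid, change):
        # apply_dispatch(): data-modification messages are dropped while skipping
        if not self.skipping_changes():
            self.applied.append(change)

    def handle_commit(self, xid):
        # stop_skipping_changes(): clear local state and the catalog field
        if self.skipping_changes():
            self.skipping_xid = INVALID_XID
            self.subskipxid = INVALID_XID  # subskipxid is set to NULL
```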
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index aac2433d82..af5c16abfa 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3677,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bc4a549cdb..025f759c93 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -545,7 +545,7 @@ typedef struct PgStat_MsgReplSlot
/* ----------
* PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
- * update/reset the error happening during logical
+ * update/reset/clear the error happening during logical
* replication.
* ----------
*/
@@ -563,7 +563,10 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The clear message uses below field */
+
+ /* The clear and reset messages use the fields below */
+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */
bool m_reset; /* Reset all fields and set reset_stats
* timestamp */
@@ -1097,6 +1100,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 9518e2c20b..f1934c1064 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -297,6 +297,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index b73d932dfc..a69bb3b36e 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -231,6 +231,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
DROP SUBSCRIPTION regress_testsub;
diff --git a/src/test/subscription/t/023_skip_xact.pl b/src/test/subscription/t/023_skip_xact.pl
new file mode 100644
index 0000000000..8384645759
--- /dev/null
+++ b/src/test/subscription/t/023_skip_xact.pl
@@ -0,0 +1,185 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Test skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 8;
+
+sub test_subscription_error
+{
+ my ($node, $expected, $source, $relname, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# don't overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ 'wal_retrieve_retry_interval = 5s');
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Start logical replication. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to the unique constraint violation.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on);");
+
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Also wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violating
+# the unique constraint on test_tab1.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab1 VALUES (1)");
+
+# Insert enough rows to test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber for the same reason.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);");
+
+# Check that the errors on both subscriptions are reported.
+test_subscription_error($node_subscriber, qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'apply', 'test_tab1', 'error reporting by the apply worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'tablesync', 'test_tab2', 'error reporting by the table sync worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'apply', 'test_tab_streaming', 'error reporting by the apply worker');
+
+# Tell the subscriptions to skip the transactions in question by XID.
+my $skip_xid1 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab1'::regclass");
+my $skip_xid2 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab_streaming'::regclass");
+
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (skip_xid = $skip_xid1)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_streaming SET (skip_xid = $skip_xid2)");
+
+# Restart the subscriber to restart logical replication without waiting for
+# wal_retrieve_retry_interval.
+$node_subscriber->restart;
+
+# Wait until the transactions in question are skipped.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription
+WHERE subname in ('tap_sub', 'tap_sub_streaming') AND subskipxid IS NULL
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+# Insert data to test_tab1 that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+
+# Also, insert data to test_tab_streaming that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transaction.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped transaction");
+
+# Check that the view shows no entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
From 52e1960ffd2135db8190be6d401d746b3ec1598e Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v5 3/4] Add RESET command to ALTER SUBSCRIPTION command.
The ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be reset are streaming, binary,
and synchronous_commit.
The RESET command is required by a follow-up commit that introduces
a new parameter, skip_xid, which needs to be reset.
---
doc/src/sgml/ref/alter_subscription.sgml | 17 ++--
src/backend/commands/subscriptioncmds.c | 103 ++++++++++++++++-----
src/backend/parser/gram.y | 11 ++-
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 11 +++
src/test/regress/sql/subscription.sql | 11 +++
6 files changed, 125 insertions(+), 31 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..626fb8afa2 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,15 +193,17 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
- This clause alters parameters originally set by
- <xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
- are <literal>slot_name</literal>,
- <literal>synchronous_commit</literal>,
- <literal>binary</literal>, and
- <literal>streaming</literal>.
+ This clause sets or resets a subscription option. The parameters that can be
+ set are the parameters originally set by <xref linkend="sql-createsubscription"/>:
+ <literal>slot_name</literal>, <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, and <literal>streaming</literal>.
+ </para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 22ae982328..d0cabedd15 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset)
+ {
+ if (defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
+ }
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -141,7 +151,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_CONNECT;
- opts->connect = defGetBoolean(defel);
+ if (!is_reset)
+ opts->connect = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_ENABLED) &&
strcmp(defel->defname, "enabled") == 0)
@@ -150,7 +161,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_ENABLED;
- opts->enabled = defGetBoolean(defel);
+ if (!is_reset)
+ opts->enabled = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_CREATE_SLOT) &&
strcmp(defel->defname, "create_slot") == 0)
@@ -159,7 +171,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_CREATE_SLOT;
- opts->create_slot = defGetBoolean(defel);
+ if (!is_reset)
+ opts->create_slot = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_SLOT_NAME) &&
strcmp(defel->defname, "slot_name") == 0)
@@ -168,7 +181,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SLOT_NAME;
- opts->slot_name = defGetString(defel);
+ if (!is_reset)
+ opts->slot_name = defGetString(defel);
/* Setting slot_name = NONE is treated as no slot name. */
if (strcmp(opts->slot_name, "none") == 0)
@@ -183,7 +197,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_COPY_DATA;
- opts->copy_data = defGetBoolean(defel);
+ if (!is_reset)
+ opts->copy_data = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_SYNCHRONOUS_COMMIT) &&
strcmp(defel->defname, "synchronous_commit") == 0)
@@ -192,12 +207,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -206,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_REFRESH;
- opts->refresh = defGetBoolean(defel);
+ if (!is_reset)
+ opts->refresh = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_BINARY) &&
strcmp(defel->defname, "binary") == 0)
@@ -215,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +247,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -245,7 +269,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
- opts->twophase = defGetBoolean(defel);
+ if (!is_reset)
+ opts->twophase = defGetBoolean(defel);
}
else
ereport(ERROR,
@@ -416,7 +441,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -885,14 +911,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -948,10 +974,43 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -986,7 +1045,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1036,7 +1095,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1084,7 +1143,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..bcf85e8980 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,7 +9707,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e28248af32..aac2433d82 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,7 +3659,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 67f92b3878..9518e2c20b 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -286,6 +286,17 @@ ERROR: unrecognized subscription parameter: "two_phase"
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ERROR: cannot set streaming = true for two-phase enabled subscription
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 88743ab33b..b73d932dfc 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -220,6 +220,17 @@ ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
DROP SUBSCRIPTION regress_testsub;
--
2.24.3 (Apple Git-128)
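Taken together with the regression tests above, the new RESET syntax can be exercised like this (a sketch; the subscription name is hypothetical, and the defaults shown are the ones this patch restores):

```sql
-- Set parameters explicitly ...
ALTER SUBSCRIPTION mysub SET (binary = true, streaming = on);

-- ... and restore their defaults (binary = false, streaming = off,
-- synchronous_commit = 'off') without spelling the defaults out:
ALTER SUBSCRIPTION mysub RESET (binary, streaming);

-- RESET rejects parameter values:
ALTER SUBSCRIPTION mysub RESET (streaming = off);
-- ERROR:  RESET must not include values for parameters
```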
Attachment: v5-0001-Add-errcontext-to-errors-happening-during-applyin.patch
From c87a6107aac7416f72e2236d687767fc50b470be Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v5 1/4] Add errcontext to errors happening during applying
logical replication changes.
This commit adds an error context to errors that happen while applying
logical replication changes, showing the command, the relation, the
transaction ID, and the commit timestamp in the server log.
---
src/backend/replication/logical/proto.c | 49 ++++++
src/backend/replication/logical/worker.c | 181 ++++++++++++++++++++---
src/include/replication/logicalproto.h | 1 +
3 files changed, 209 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 2d774567e0..aa77fb2820 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1131,3 +1131,52 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Get a string representing the given LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 249de80798..c2376d755f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -221,6 +221,28 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname;
+ char *relname;
+
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
+
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .relname = NULL,
+ .nspname = NULL,
+ .remote_xid = InvalidTransactionId,
+ .committs = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -335,6 +357,12 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for apply error callback */
+static void apply_error_callback(void *arg);
+static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
+static void reset_apply_error_context_rel(void);
+static void reset_apply_error_context_info(void);
+
/*
* Should this worker apply changes for given relation.
*
@@ -827,6 +855,8 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.committime;
remote_final_lsn = begin_data.final_lsn;
@@ -860,6 +890,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -877,6 +908,8 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.prepare_time;
remote_final_lsn = begin_data.prepare_lsn;
@@ -962,6 +995,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -974,6 +1008,8 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ apply_error_callback_arg.remote_xid = prepare_data.xid;
+ apply_error_callback_arg.committs = prepare_data.commit_time;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -1001,6 +1037,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1013,6 +1050,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ apply_error_callback_arg.remote_xid = rollback_data.xid;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1050,6 +1088,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1106,6 +1145,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ apply_error_callback_arg.remote_xid = stream_xid;
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1162,6 +1203,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1185,7 +1227,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ apply_error_callback_arg.remote_xid = xid;
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1210,6 +1255,7 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ apply_error_callback_arg.remote_xid = subxid;
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1234,6 +1280,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1265,6 +1312,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1409,6 +1458,8 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.committs = commit_data.committime;
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1423,6 +1474,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1542,6 +1595,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1565,6 +1621,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1663,6 +1722,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1716,6 +1778,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1819,6 +1884,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1844,6 +1912,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2278,44 +2349,54 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
+
+ /*
+ * Push apply error context callback. Other fields will be filled during
+ * applying the change.
+ */
+ apply_error_callback_arg.command = action;
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2324,45 +2405,48 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_END:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
}
/*
@@ -3595,3 +3679,56 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("during apply of \"%s\""),
+ logicalrep_message_type(apply_error_callback_arg.command));
+
+ if (apply_error_callback_arg.relname)
+ appendStringInfo(&buf, _(" for relation \"%s.%s\""),
+ apply_error_callback_arg.nspname,
+ apply_error_callback_arg.relname);
+
+ if (TransactionIdIsNormal(apply_error_callback_arg.remote_xid))
+ appendStringInfo(&buf, _(" in transaction with xid %u committs %s"),
+ apply_error_callback_arg.remote_xid,
+ apply_error_callback_arg.committs == 0
+ ? "(unset)"
+ : timestamptz_to_str(apply_error_callback_arg.committs));
+
+ errcontext("%s", buf.data);
+}
+
+/* Set relation information of apply error callback */
+static void
+set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
+{
+ apply_error_callback_arg.nspname = rel->remoterel.nspname;
+ apply_error_callback_arg.relname = rel->remoterel.relname;
+}
+
+/* Reset relation information of apply error callback */
+static void
+reset_apply_error_context_rel(void)
+{
+ apply_error_callback_arg.nspname = NULL;
+ apply_error_callback_arg.relname = NULL;
+}
+
+/* Reset all information of apply error callback */
+static void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.remote_xid = InvalidTransactionId;
+ apply_error_callback_arg.committs = 0;
+ reset_apply_error_context_rel();
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 63de90d94a..c78a4409bc 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -242,5 +242,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
--
2.24.3 (Apple Git-128)
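With this patch applied, a constraint violation during apply would produce server log output roughly like the following (values are illustrative; the format follows the apply_error_callback() string above):

```
ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (c)=(1) already exists.
CONTEXT:  during apply of "INSERT" for relation "public.test" in transaction with xid 590 committs 2021-05-21 14:32:02.134273+09
```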
Attachment: v5-0002-Add-pg_stat_subscription_errors-statistics-view.patch
From 02457ee21ed617dc08ec4aec4340442b00eb7147 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v5 2/4] Add pg_stat_subscription_errors statistics view.
This commit adds a new system view, pg_stat_subscription_errors, showing
errors that happened while applying logical replication changes as well
as during the initial table synchronization.
The subscription error entries are removed by autovacuum workers: when
the table synchronization has completed, for tablesync workers, and when
the subscription is dropped, for apply workers.
It also adds the SQL function pg_stat_reset_subscription_error() to
reset a single subscription error entry.
---
doc/src/sgml/monitoring.sgml | 160 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/postmaster/pgstat.c | 651 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 46 +-
src/backend/utils/adt/pgstatfuncs.c | 119 +++++
src/backend/utils/error/elog.c | 16 +
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 106 ++++
src/include/utils/elog.h | 1 +
src/test/regress/expected/rules.out | 22 +
11 files changed, 1158 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..0fd0bbfa1f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that occurred on a subscription, showing information
+ about the subscription errors.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error happened. This
+ field is always NULL if the error is reported by a
+ <literal>tablesync</literal> worker.
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher's transaction being applied when
+ the error happened. This field is always NULL if the error is
+ reported by a <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times the error happened in the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Message of the error reported at the last failure time.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets the error statistics of the <literal>tablesync</literal> worker
+ for the relation with OID <parameter>relid</parameter>. Otherwise,
+ resets the error statistics of the <literal>apply</literal> worker
+ running on the subscription with OID <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..449692afa9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 11702f2a80..8b5ff370d3 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/partition.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_HASH_SIZE 32
/* ----------
@@ -279,6 +282,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -320,6 +324,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubEntry *pgstat_get_subscription_entry(Oid subid,
+ bool create);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -358,6 +368,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1134,6 +1148,166 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscription and error entries in the stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubEntry *subent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS hstat_rel;
+ List *not_ready_rels_list;
+ HTAB *not_ready_rels_htab;
+ ListCell *lc;
+ HASHCTL hash_ctl;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = subent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (subent->suberrors == NULL)
+ continue;
+
+ /*
+ * The subscription has error entries. We search for errors of the
+ * table sync workers whose relations are already in ready state;
+ * those errors should be removed.
+ *
+ * Note that the lifetimes of the error entries of the apply worker
+ * and the table sync worker differ. The former lives until the
+ * subscription is dropped, whereas the latter lives until the table
+ * synchronization is completed.
+ */
+ not_ready_rels_list = GetSubscriptionNotReadyRelations(subent->subid);
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ /*
+ * The number of not-ready relations can be high, for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState entries into a hash table for faster lookups.
+ */
+ foreach(lc, not_ready_rels_list)
+ {
+ SubscriptionRelState *r_elem = (SubscriptionRelState *) lfirst(lc);
+ SubscriptionRelState *r_entry;
+
+ CHECK_FOR_INTERRUPTS();
+ r_entry = hash_search(not_ready_rels_htab, (void *) &(r_elem->relid),
+ HASH_ENTER, NULL);
+ memcpy(r_entry, r_elem, sizeof(SubscriptionRelState));
+ }
+
+ list_free(not_ready_rels_list);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+
+ /*
+ * Search for all error entries whose relation is already in ready
+ * state.
+ */
+ hash_seq_init(&hstat_rel, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(errent->subrelid))
+ continue;
+
+ /*
+ * Add the relid to the message if the table synchronization
+ * for this relation is already complete or the table is no
+ * longer subscribed.
+ */
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->subrelid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(not_ready_rels_htab);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1543,6 +1717,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the subscription error.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1863,6 +2056,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about the subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_reset = false;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_failure_time = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ /* Include the terminating NUL of the (possibly truncated) message */
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(msg.m_errmsg) + 1;
+ Assert(len <= PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2895,6 +3119,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3424,6 +3664,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3725,6 +3978,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionHash)
+ {
+ PgStat_StatSubEntry *subent;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS relhstat;
+ int32 nerrors;
+
+ /* Skip this subscription if it doesn't have any errors */
+ if (subent->suberrors == NULL)
+ continue;
+
+ nerrors = hash_get_num_entries(subent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(subent, sizeof(PgStat_StatSubEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubErrEntry entry, which
+ * contains a fixed-length error message string of
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes, bloating the stats
+ * file. That's okay since we assume that the number of error
+ * entries is not high. But if that expectation turns out to be
+ * false, we should write the string and its length instead.
+ */
+ rc = fwrite(errent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4184,6 +4481,100 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubEntry *subent;
+ int32 nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin) !=
+ sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription error entry */
+ subent =
+ (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &(subbuf.subid),
+ HASH_ENTER, NULL);
+ subent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Read error entries */
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubErrEntry errbuf;
+ PgStat_StatSubErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ errent =
+ (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &(errbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(errent, &errbuf, sizeof(PgStat_StatSubErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4526,6 +4917,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4716,6 +5151,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5650,6 +6086,116 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ if (errent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ Assert(errent);
+
+ /* update the error entry */
+ errent->databaseid = msg->m_databaseid;
+ errent->relid = msg->m_relid;
+ errent->command = msg->m_command;
+ errent->xid = msg->m_xid;
+ errent->failure_count++;
+ errent->last_failure = msg->m_failure_time;
+ strlcpy(errent->last_errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (subent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (subent->suberrors != NULL)
+ hash_destroy(subent->suberrors);
+
+ /* Remove the subscription entry */
+ (void) hash_search(subscriptionHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subid, false);
+
+ /*
+ * Nothing to do if the subscription entry is not found or has no error
+ * entries. The former could happen when the subscription with
+ * msg->m_subid is removed and the corresponding entry is also removed
+ * before receiving the error purge message.
+ */
+ if (subent == NULL || subent->suberrors == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ (void) hash_search(subent->suberrors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5747,6 +6293,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_entry
+ *
+ * Return the subscription statistics entry for the given subscription OID.
+ * Return NULL if it is not found and the caller didn't request to create it.
+ *
+ * create tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubEntry *
+pgstat_get_subscription_entry(Oid subid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ subent = (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize field */
+ if (create && !found)
+ subent->suberrors = NULL;
+
+ return subent;
+}
+
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription OID and
+ * relation OID. Return NULL if it is not found and the caller didn't
+ * request to create it.
+ *
+ * create tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ PgStat_StatSubErrEntry *errent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ subent = pgstat_get_subscription_entry(subid, create);
+
+ if (subent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ errent = (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ errent->databaseid = InvalidOid;
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = 0;
+ }
+
+ return errent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c2376d755f..5a3ba8d7c1 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -227,8 +227,9 @@ typedef struct ApplyErrCallbackArg
LogicalRepMsgType command; /* 0 if invalid */
/* Local relation information */
- char *nspname;
- char *relname;
+ Oid relid; /* used for error report */
+ char *nspname; /* used for error context */
+ char *relname; /* used for error context */
TransactionId remote_xid;
TimestampTz committs;
@@ -237,6 +238,7 @@ typedef struct ApplyErrCallbackArg
static ApplyErrCallbackArg apply_error_callback_arg =
{
.command = 0,
+ .relid = InvalidOid,
.relname = NULL,
.nspname = NULL,
.remote_xid = InvalidTransactionId,
@@ -3547,8 +3549,23 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ geterrmessage());
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3666,7 +3683,24 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.relid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ geterrmessage());
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
@@ -3711,6 +3745,7 @@ apply_error_callback(void *arg)
static void
set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
{
+ apply_error_callback_arg.relid = rel->localreloid;
apply_error_callback_arg.nspname = rel->remoterel.nspname;
apply_error_callback_arg.relname = rel->remoterel.relname;
}
@@ -3719,6 +3754,7 @@ set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
static void
reset_apply_error_context_rel(void)
{
+ apply_error_callback_arg.relid = InvalidOid;
apply_error_callback_arg.nspname = NULL;
apply_error_callback_arg.relname = NULL;
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f0e09eae4d..b53be1576f 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2240,6 +2241,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset the error statistics of a subscription */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2380,3 +2398,104 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubErrEntry *errent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ errent = pgstat_fetch_subscription_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (errent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(errent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(errent->relid))
+ values[2] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[2] = true;
+
+ if (errent->command == 0)
+ nulls[3] = true;
+ else
+ {
+ /* should be apply worker */
+ Assert(!OidIsValid(errent->subrelid));
+
+ values[3] = CStringGetTextDatum(logicalrep_message_type(errent->command));
+ }
+
+ if (TransactionIdIsValid(errent->xid))
+ values[4] = TransactionIdGetDatum(errent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(errent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ values[6] = Int64GetDatum(errent->failure_count);
+
+ if (errent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(errent->last_failure);
+
+ values[8] = CStringGetTextDatum(errent->last_errmsg);
+
+ if (errent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(errent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..dd36850016 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,6 +1441,22 @@ getinternalerrposition(void)
return edata->internalpos;
}
+/*
+ * geterrmessage --- return the currently set error message
+ *
+ * This is only intended for use while an error is being processed, for
+ * example in error callback subroutines or PG_CATCH blocks; elsewhere
+ * the concept is not meaningful.
+ */
+const char *
+geterrmessage(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return (const char *) edata->message;
+}
/*
* Functions to allow construction of error message strings separately from
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 8cd0252082..044ff52227 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5321,6 +5321,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5708,6 +5716,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9612c0a6c2..bc4a549cdb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -539,6 +543,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync
+ * worker to report or reset an error that
+ * happened during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* The reset message uses the field below */
+ bool m_reset; /* Reset all fields and set the stats
+ * reset timestamp */
+
+ /* The error report message uses the fields below */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_failure_time;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by autovacuum to purge the subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by autovacuum to purge the subscription
+ * errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -710,6 +776,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -908,6 +977,38 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription statistics kept in the stats collector.
+ */
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubEntry;
+
+/*
+ * Subscription error statistics kep in the stats collector. One entry represents
+ * an error that happened during logical replication, reported by the apply worker
+ * (subrelid is InvalidOid) or by the table sync worker (subrelid is a valid OID).
+ * The error reported by the table sync worker is also removed when the table
+ * synchronization process completes.
+ */
+
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char last_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -995,6 +1096,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1011,6 +1113,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1106,6 +1211,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid, Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f53607e12e..155145a77d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -222,6 +222,7 @@ extern int err_generic_string(int field, const char *str);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
+extern const char *geterrmessage(void);
/*----------
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..ffad9790ae 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
--
2.24.3 (Apple Git-128)
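As an aside for readers of the PGSTAT_NUM_* macros in the hunk above: each one divides the message payload left over after the fixed fields by the size of one array element. The arithmetic can be sketched standalone; note that the PgStat_MsgHdr layout and the 1000-byte message cap below are simplified assumptions for illustration, not the server's actual definitions.

```c
#include <stddef.h>

/* Simplified stand-ins; the real definitions live in src/include/pgstat.h
 * and may differ per build.  PGSTAT_MAX_MSG_SIZE is assumed to be 1000. */
typedef unsigned int Oid;
typedef struct PgStat_MsgHdr
{
	int			m_type;
	int			m_size;
} PgStat_MsgHdr;

#define PGSTAT_MAX_MSG_SIZE 1000
#define PGSTAT_MSG_PAYLOAD	(PGSTAT_MAX_MSG_SIZE - sizeof(PgStat_MsgHdr))

/* Same shape as the patch's macros: subtract the fixed fields
 * (m_nentries, and m_subid for the error purge), then divide by
 * the size of one array element. */
#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
	((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
	((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
```

With the assumed 8-byte header this gives 247 subids per purge message and 246 relids per error-purge message; the point is only that the array is sized so one message never exceeds the payload.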
On Tue, Aug 3, 2021 at 12:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Aug 2, 2021 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Aug 2, 2021 at 7:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jul 30, 2021 at 12:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 29, 2021 at 11:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Setting up logical rep error context in a generic function looks a bit
odd to me. Do we really need to set up error context here? I
understand we can't do this in caller but anyway I think we are not
sending this to logical replication view as well, so not sure we need
to do it here.

Yeah, I'm not convinced of this part yet. I wanted to show relid also
in truncate cases but I came up with only this idea.

If an error happens during truncating the table (in
ExecuteTruncateGuts()), relid set by
set_logicalrep_error_context_rel() is actually sent to the view. If we
don’t have it, the view always shows relid as NULL in truncate cases.
On the other hand, it doesn’t cover all cases. For example, it doesn’t
cover an error that the target table doesn’t exist on the subscriber,
which happens when opening the target table. Anyway, in most cases,
even if relid is NULL, the error message in the view helps users to
know which relation the error happened on. What do you think?

Yeah, I also think at this stage error message is sufficient in such cases.
I've attached new patches that incorporate all comments I got so far.
Please review them.
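The CONTEXT line under discussion comes from the error-context-callback pattern: the apply worker pushes a callback that, when any ERROR is raised mid-apply, appends the command, relation, and remote xid to the message. A minimal standalone model of that pattern follows; the types and names here are illustrative stand-ins, not the actual elog.c or worker.c code.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for the elog.h ErrorContextCallback chain: when an error is
 * reported, each callback on the stack contributes one CONTEXT line. */
typedef struct ErrCtxCallback
{
	struct ErrCtxCallback *previous;
	void		(*callback) (void *arg, char *buf, size_t buflen);
	void	   *arg;
} ErrCtxCallback;

static ErrCtxCallback *error_context_stack = NULL;

/* Mirrors the shape of the patch's ApplyErrCallbackArg */
typedef struct ApplyInfo
{
	const char *command;
	const char *qualified_relname;	/* "schema.table" */
	unsigned int remote_xid;
} ApplyInfo;

static void
apply_error_callback(void *arg, char *buf, size_t buflen)
{
	ApplyInfo  *info = (ApplyInfo *) arg;

	snprintf(buf, buflen,
			 "during apply of \"%s\" for relation \"%s\" in transaction with xid %u",
			 info->command, info->qualified_relname, info->remote_xid);
}

/* Build the CONTEXT text the way error reporting would on failure. */
static void
build_context_line(char *buf, size_t buflen)
{
	buf[0] = '\0';
	if (error_context_stack != NULL)
		error_context_stack->callback(error_context_stack->arg, buf, buflen);
}
```

Because the callback is registered before each change is applied, even an error raised deep inside executor code (such as ExecuteTruncateGuts) ends up annotated with the remote xid the user needs in order to skip the transaction.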
I had a look at the first patch, couple of minor comments:
1) Should we include this in typedefs.list?
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname;
2) We can keep the case statements in the same order as in the
LogicalRepMsgType enum; this will help in easily identifying if any
enum gets missed.
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
Regards,
Vignesh
On Tuesday, August 3, 2021 2:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached new patches that incorporate all comments I got so far.
Please review them.
Hi,
I had a few comments for the 0003 patch.
1).
- This clause alters parameters originally set by
- <xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
- are <literal>slot_name</literal>,
- <literal>synchronous_commit</literal>,
- <literal>binary</literal>, and
- <literal>streaming</literal>.
+ This clause sets or resets a subscription option. The parameters that can be
+ set are the parameters originally set by <xref linkend="sql-createsubscription"/>:
+ <literal>slot_name</literal>, <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>.
+ </para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
Maybe the doc looks better like the following ?
+ This clause alters parameters originally set by
+ <xref linkend="sql-createsubscription"/>. See there for more
+ information. The parameters that can be set
+ are <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, and
+ <literal>streaming</literal>.
+ </para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
2).
- opts->create_slot = defGetBoolean(defel);
+ if (!is_reset)
+ opts->create_slot = defGetBoolean(defel);
}
Since we only support RESET streaming/binary/synchronous_commit, it
might be unnecessary to add the check 'if (!is_reset)' for other
option.
3).
typedef struct AlterSubscriptionStmt
{
NodeTag type;
AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
Since the patch removes the enum value
'ALTER_SUBSCRIPTION_OPTIONS', it'd be better to change the comment here
as well.
Best regards,
houzj
On Tue, Aug 3, 2021 at 7:54 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, Aug 3, 2021 at 12:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Aug 2, 2021 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Aug 2, 2021 at 7:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jul 30, 2021 at 12:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jul 29, 2021 at 11:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Setting up logical rep error context in a generic function looks a bit
odd to me. Do we really need to set up error context here? I
understand we can't do this in caller but anyway I think we are not
sending this to logical replication view as well, so not sure we need
to do it here.

Yeah, I'm not convinced of this part yet. I wanted to show relid also
in truncate cases but I came up with only this idea.

If an error happens during truncating the table (in
ExecuteTruncateGuts()), relid set by
set_logicalrep_error_context_rel() is actually sent to the view. If we
don’t have it, the view always shows relid as NULL in truncate cases.
On the other hand, it doesn’t cover all cases. For example, it doesn’t
cover an error that the target table doesn’t exist on the subscriber,
which happens when opening the target table. Anyway, in most cases,
even if relid is NULL, the error message in the view helps users to
know which relation the error happened on. What do you think?

Yeah, I also think at this stage error message is sufficient in such cases.
I've attached new patches that incorporate all comments I got so far.
Please review them.

I had a look at the first patch, couple of minor comments:

1) Should we include this in typedefs.list?

+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+	LogicalRepMsgType command;	/* 0 if invalid */
+
+	/* Local relation information */
+	char	   *nspname;

2) We can keep the case statements in the same order as in the
LogicalRepMsgType enum; this will help in easily identifying if any
enum gets missed.

+		case LOGICAL_REP_MSG_RELATION:
+			return "RELATION";
+		case LOGICAL_REP_MSG_TYPE:
+			return "TYPE";
+		case LOGICAL_REP_MSG_ORIGIN:
+			return "ORIGIN";
+		case LOGICAL_REP_MSG_MESSAGE:
+			return "MESSAGE";
+		case LOGICAL_REP_MSG_STREAM_START:
+			return "STREAM START";
Thank you for reviewing the patch!
I agreed with all comments and will fix them in the next version patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Aug 4, 2021 at 1:02 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tuesday, August 3, 2021 2:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached new patches that incorporate all comments I got so far.
Please review them.

Hi,
I had a few comments for the 0003 patch.
Thanks for reviewing the patch!
1).

- This clause alters parameters originally set by
- <xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
- are <literal>slot_name</literal>,
- <literal>synchronous_commit</literal>,
- <literal>binary</literal>, and
- <literal>streaming</literal>.
+ This clause sets or resets a subscription option. The parameters that can be
+ set are the parameters originally set by <xref linkend="sql-createsubscription"/>:
+ <literal>slot_name</literal>, <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>.
+ </para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.

Maybe the doc looks better like the following ?

+ This clause alters parameters originally set by
+ <xref linkend="sql-createsubscription"/>. See there for more
+ information. The parameters that can be set
+ are <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, and
+ <literal>streaming</literal>.
+ </para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
Agreed.
2).

- opts->create_slot = defGetBoolean(defel);
+ if (!is_reset)
+ opts->create_slot = defGetBoolean(defel);
}

Since we only support RESET streaming/binary/synchronous_commit, it
might be unnecessary to add the check 'if (!is_reset)' for other
option.
Good point.
3).
typedef struct AlterSubscriptionStmt
{
NodeTag type;
AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */

Since the patch removes the enum value
'ALTER_SUBSCRIPTION_OPTIONS', it'd be better to change the comment here
as well.
Agreed.
I'll incorporate those comments in the next version patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tuesday, August 3, 2021 3:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached new patches that incorporate all comments I got so far.
Please review them.
Hi, I had a chance to look at the patch-set during my other development.
Just let me share some minor cosmetic things.
[1] unnatural wording ? in v5-0002.
+ * create tells whether to create the new subscription entry if it is not
+ * create tells whether to create the new subscription relation entry if it is
I'm not sure if this wording is correct or not.
You meant just "tells whether to create ...." ?,
although we already have 1 other "create tells" in HEAD.
[2]: typo "kep" in v05-0002.
I think you meant "kept" in below sentence.
+/*
+ * Subscription error statistics kep in the stats collector. One entry represents
+ * an error that happened during logical replication, reported by the apply worker
+ * (subrelid is InvalidOid) or by the table sync worker (subrelid is a valid OID).
[3]: typo "lotigcal" in the v05-0004 commit message.
If incoming change violates any constraint, lotigcal replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question.
It should be "logical".
[4]: warning of doc build
I've gotten an output like below during my process of make html.
Could you please check this ?
Link element has no content and no Endterm. Nothing to show in the link to monitoring-pg-stat-subscription-errors
Best Regards,
Takamichi Osumi
On Wednesday, August 4, 2021 8:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I'll incorporate those comments in the next version patch.
Hi, when are you going to make and share the updated v6 ?
Best Regards,
Takamichi Osumi
On Thu, Aug 5, 2021 at 5:58 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Tuesday, August 3, 2021 3:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached new patches that incorporate all comments I got so far.
Please review them.

Hi, I had a chance to look at the patch-set during my other development.
Just let me share some minor cosmetic things.
Thank you for reviewing the patches!
[1] unnatural wording ? in v5-0002.

+ * create tells whether to create the new subscription entry if it is not
+ * create tells whether to create the new subscription relation entry if it is

I'm not sure if this wording is correct or not.
You meant just "tells whether to create ...." ?,
although we already have 1 other "create tells" in HEAD.
create here means the function argument of
pgstat_get_subscription_entry() and
pgstat_get_subscription_error_entry(). That is, the function argument
'create' tells whether to create the new entry if not found. I
single-quoted the 'create' to avoid confusion.
[2] typo "kep" in v05-0002.
I think you meant "kept" in below sentence.
+/*
+ * Subscription error statistics kep in the stats collector. One entry represents
+ * an error that happened during logical replication, reported by the apply worker
+ * (subrelid is InvalidOid) or by the table sync worker (subrelid is a valid OID).
Fixed.
[3] typo "lotigcal" in the v05-0004 commit message.
If incoming change violates any constraint, lotigcal replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question.

It should be "logical".
Fixed.
[4] warning of doc build
I've gotten an output like below during my process of make html.
Could you please check this ?

Link element has no content and no Endterm. Nothing to show in the link to monitoring-pg-stat-subscription-errors
Fixed.
I've attached the latest patches that incorporated all comments I got
so far. Please review them.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v6-0003-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patchapplication/octet-stream; name=v6-0003-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patchDownload
From a4fd49eb92744f40de69996a5185cc2754715fe0 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v6 3/4] Add RESET command to ALTER SUBSCRIPTION command.
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be reset are streaming, binary,
and synchronous_commit.

The RESET command is required by a follow-up commit that introduces a
new parameter, skip_xid, which needs to be reset.
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 78 +++++++++++++++++-----
src/backend/parser/gram.y | 11 ++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 11 +++
src/test/regress/sql/subscription.sql | 11 +++
6 files changed, 105 insertions(+), 19 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..8c3c28b7e7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,16 +193,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 22ae982328..b72cdaba90 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -416,7 +430,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -885,14 +900,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -948,10 +963,43 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -986,7 +1034,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1036,7 +1084,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1084,7 +1132,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..bcf85e8980 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,7 +9707,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e28248af32..504d65f7d6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,7 +3659,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3671,7 +3672,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 67f92b3878..9518e2c20b 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -286,6 +286,17 @@ ERROR: unrecognized subscription parameter: "two_phase"
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ERROR: cannot set streaming = true for two-phase enabled subscription
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 88743ab33b..b73d932dfc 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -220,6 +220,17 @@ ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
DROP SUBSCRIPTION regress_testsub;
--
2.24.3 (Apple Git-128)
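The control flow this patch threads through parse_subscription_options can be boiled down to a standalone sketch. Opt below is a hypothetical stand-in for DefElem, and the two options shown are just examples; the real function walks a List of DefElems and supports more parameters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for DefElem: an option name plus optional value. */
typedef struct Opt
{
	const char *name;
	const char *value;			/* NULL when no value was given */
} Opt;

typedef struct SubOpts
{
	bool		streaming;
	bool		binary;
} SubOpts;

/*
 * Returns false to signal the "RESET must not include values for
 * parameters" error.  On SET the value is applied; on RESET only the
 * option name matters, and the caller restores the built-in default.
 */
static bool
parse_one_option(const Opt *opt, bool is_reset, SubOpts *opts)
{
	if (is_reset && opt->value != NULL)
		return false;			/* e.g. RESET (streaming = on) is rejected */

	if (strcmp(opt->name, "streaming") == 0 && !is_reset)
		opts->streaming = (strcmp(opt->value, "on") == 0);
	else if (strcmp(opt->name, "binary") == 0 && !is_reset)
		opts->binary = (strcmp(opt->value, "on") == 0);

	return true;
}
```

This is why the patch wraps the defGetBoolean/defGetString calls in `if (!is_reset)`: for RESET, recording the option in specified_opts is enough, and AlterSubscription then writes the default back to pg_subscription.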
v6-0001-Add-errcontext-to-errors-happening-during-applyin.patchapplication/octet-stream; name=v6-0001-Add-errcontext-to-errors-happening-during-applyin.patchDownload
From 58b7e870bde972d80c4e5af64f1ae933b0a0c6fa Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v6 1/4] Add errcontext to errors happening during applying
logical replication changes.
This commit adds the error context to errors happening during applying
logical replication changes, showing the command, the relation,
transaction ID, and commit timestamp in the server log.
---
src/backend/replication/logical/proto.c | 49 ++++++
src/backend/replication/logical/worker.c | 181 ++++++++++++++++++++---
src/include/replication/logicalproto.h | 1 +
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 210 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 2d774567e0..bd0a8bf02b 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1131,3 +1131,52 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * get string representing LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 249de80798..c2376d755f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -221,6 +221,28 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname;
+ char *relname;
+
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
+
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .relname = NULL,
+ .nspname = NULL,
+ .remote_xid = InvalidTransactionId,
+ .committs = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -335,6 +357,12 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for apply error callback */
+static void apply_error_callback(void *arg);
+static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
+static void reset_apply_error_context_rel(void);
+static void reset_apply_error_context_info(void);
+
/*
* Should this worker apply changes for given relation.
*
@@ -827,6 +855,8 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.committime;
remote_final_lsn = begin_data.final_lsn;
@@ -860,6 +890,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -877,6 +908,8 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ apply_error_callback_arg.remote_xid = begin_data.xid;
+ apply_error_callback_arg.committs = begin_data.prepare_time;
remote_final_lsn = begin_data.prepare_lsn;
@@ -962,6 +995,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -974,6 +1008,8 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ apply_error_callback_arg.remote_xid = prepare_data.xid;
+ apply_error_callback_arg.committs = prepare_data.commit_time;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -1001,6 +1037,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1013,6 +1050,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ apply_error_callback_arg.remote_xid = rollback_data.xid;
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1050,6 +1088,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1106,6 +1145,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ apply_error_callback_arg.remote_xid = stream_xid;
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1162,6 +1203,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1185,7 +1227,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ apply_error_callback_arg.remote_xid = xid;
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1210,6 +1255,7 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ apply_error_callback_arg.remote_xid = subxid;
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1234,6 +1280,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1265,6 +1312,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1409,6 +1458,8 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.committs = commit_data.committime;
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1423,6 +1474,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1542,6 +1595,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1565,6 +1621,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1663,6 +1722,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1716,6 +1778,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1819,6 +1884,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1844,6 +1912,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2278,44 +2349,54 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
+
+ /*
+ * Push apply error context callback. Other fields will be filled during
+ * applying the change.
+ */
+ apply_error_callback_arg.command = action;
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2324,45 +2405,48 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_END:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
}
/*
@@ -3595,3 +3679,56 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("during apply of \"%s\""),
+ logicalrep_message_type(apply_error_callback_arg.command));
+
+ if (apply_error_callback_arg.relname)
+ appendStringInfo(&buf, _(" for relation \"%s.%s\""),
+ apply_error_callback_arg.nspname,
+ apply_error_callback_arg.relname);
+
+ if (TransactionIdIsNormal(apply_error_callback_arg.remote_xid))
+ appendStringInfo(&buf, _(" in transaction with xid %u committs %s"),
+ apply_error_callback_arg.remote_xid,
+ apply_error_callback_arg.committs == 0
+ ? "(unset)"
+ : timestamptz_to_str(apply_error_callback_arg.committs));
+
+ errcontext("%s", buf.data);
+}
+
+/* Set relation information of apply error callback */
+static void
+set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
+{
+ apply_error_callback_arg.nspname = rel->remoterel.nspname;
+ apply_error_callback_arg.relname = rel->remoterel.relname;
+}
+
+/* Reset relation information of apply error callback */
+static void
+reset_apply_error_context_rel(void)
+{
+ apply_error_callback_arg.nspname = NULL;
+ apply_error_callback_arg.relname = NULL;
+}
+
+/* Reset all information of apply error callback */
+static void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.remote_xid = InvalidTransactionId;
+ apply_error_callback_arg.committs = 0;
+ reset_apply_error_context_rel();
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 63de90d94a..c78a4409bc 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -242,5 +242,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 37cf4b2f76..e69b708e33 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -113,6 +113,7 @@ Append
AppendPath
AppendRelInfo
AppendState
+ApplyErrCallbackArg
ApplyExecutionData
ApplySubXactData
Archive
--
2.24.3 (Apple Git-128)
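For reference, the context string assembled by apply_error_callback() in the patch above can be sketched as follows. This is a hypothetical Python simplification: the invented name apply_error_context stands in for the C callback, which builds the string with StringInfo and reports it via errcontext(), and the truthiness check on remote_xid stands in for TransactionIdIsNormal().

```python
# Hypothetical sketch of the message built by apply_error_callback() above.
# The real code is C in src/backend/replication/logical/worker.c.

def apply_error_context(command, nspname=None, relname=None,
                        remote_xid=0, committs=None):
    if not command:        # command == 0 means nothing is being applied
        return None
    buf = f'during apply of "{command}"'
    if relname:
        buf += f' for relation "{nspname}.{relname}"'
    if remote_xid:         # stands in for TransactionIdIsNormal()
        ts = committs if committs is not None else "(unset)"
        buf += f" in transaction with xid {remote_xid} committs {ts}"
    return buf
```

For example, `apply_error_context("INSERT", "public", "test", 716, "2021-07-15 21:54:58.802874+00")` reproduces the shape of the CONTEXT line shown in the server-log examples.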
Attachment: v6-0004-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch (application/octet-stream)
From 25211cdcf5d4cab4eb79bd1752ffbfb34f3a6c91 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v6 4/4] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates a constraint, logical replication stops
until the problem is resolved. This commit introduces another way to skip
the transaction in question.
The user can specify the XID via ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), which updates the pg_subscription.subskipxid field and tells the
apply worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of the
subscription in the pg_stat_subscription_errors system view, so that the
user is not confused by stale entries. This is done by sending a message
to the stats collector to clear the subscription error.
---
doc/src/sgml/logical-replication.sgml | 49 +++++-
doc/src/sgml/ref/alter_subscription.sgml | 32 +++-
src/backend/catalog/pg_subscription.c | 10 ++
src/backend/commands/subscriptioncmds.c | 45 ++++-
src/backend/postmaster/pgstat.c | 44 ++++-
src/backend/replication/logical/worker.c | 190 ++++++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/include/pgstat.h | 8 +-
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 ++
src/test/subscription/t/023_skip_xact.pl | 185 ++++++++++++++++++++
12 files changed, 569 insertions(+), 23 deletions(-)
create mode 100644 src/test/subscription/t/023_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..1e3c8c40f5 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in the <structname>pg_stat_subscription_errors</structname>
+ view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: during apply of "INSERT" for relation "public.test" in transaction with xid 716 committs 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The ID of the transaction containing the change that violates the constraint
+ can be found in those outputs (transaction ID 716 in the above case). The
+ transaction can be skipped by setting <replaceable>skip_xid</replaceable> on
+ the subscription with <command>ALTER SUBSCRIPTION ... SET</command>.
+ Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Either way, these methods should be used only as a last resort. They skip
+ the whole transaction, including changes that may not violate any
+ constraint, and can easily make the subscriber inconsistent if the user
+ specifies the wrong transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 8c3c28b7e7..cfb318e08c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -205,8 +205,36 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are:
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ <literal>streaming</literal>, and the following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication stops
+ until the problem is resolved. The resolution can be done either by
+ changing data on the subscriber so that it doesn't conflict with the
+ incoming change or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose application the logical
+ replication worker should skip. The worker skips all data modification
+ changes within the specified transaction. Since this skips the whole
+ transaction, including changes that may not violate the constraint, it
+ should only be used as a last resort. This option has no effect on a
+ transaction that is already prepared with <literal>two_phase</literal>
+ enabled on the subscriber. After the logical replication worker
+ successfully skips the transaction, the transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index b72cdaba90..1489b04793 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -504,6 +531,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -904,7 +932,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, false);
@@ -959,6 +987,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
@@ -966,7 +1001,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, true);
@@ -992,6 +1027,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 0507f1961e..ab87b77552 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1731,11 +1731,32 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
msg.m_subid = subid;
msg.m_subrelid = subrelid;
+ msg.m_clear = false;
msg.m_reset = true;
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
}
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error entry of the subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_clear = true;
+ msg.m_reset = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -2078,6 +2099,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_subrelid = subrelid;
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
+ msg.m_clear = false;
msg.m_reset = false;
msg.m_command = command;
msg.m_xid = xid;
@@ -6095,27 +6117,37 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
PgStat_StatSubErrEntry *errent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
errent = pgstat_get_subscription_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
+ Assert(!(msg->m_reset && msg->m_clear));
+
if (errent == NULL)
return;
- /* reset fields and set reset timestamp */
errent->relid = InvalidOid;
errent->command = 0;
errent->xid = InvalidTransactionId;
errent->failure_count = 0;
- errent->last_failure = 0;
- errent->last_errmsg[0] = '\0';
- errent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 5a3ba8d7c1..5b7cb1cea6 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -278,6 +279,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID if we're skipping all data modification changes
+ * of the transaction specified in MySubscription->skipxid. Note that we don't
+ * skip receiving the changes, particularly in streaming cases, since we decide
+ * whether or not to skip applying the changes when starting to apply them.
+ * Once we start skipping changes, we copy the XID to skipping_xid and don't
+ * stop skipping until we have skipped the whole transaction, even if the
+ * subscription is invalidated and MySubscription->skipxid gets changed or
+ * reset. When we stop skipping, we reset the skip XID (subskipxid) in
+ * pg_subscription and associate the origin status with the transaction that
+ * resets the skip XID, so that we can start streaming from the next
+ * transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -365,6 +381,9 @@ static void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static void reset_apply_error_context_rel(void);
static void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -862,6 +881,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -886,7 +910,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop the skipping transaction if enabled. Otherwise, commit the changes
+ * that are just applied.
+ */
+ if (skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -915,6 +950,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -983,7 +1021,10 @@ apply_handle_prepare(StringInfo s)
*/
begin_replication_step();
- apply_handle_prepare_internal(&prepare_data);
+ if (skipping_changes())
+ stop_skipping_changes(prepare_data.end_lsn, prepare_data.prepare_time);
+ else
+ apply_handle_prepare_internal(&prepare_data);
end_replication_step();
CommitTransactionCommand();
@@ -1103,9 +1144,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1127,6 +1169,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1139,9 +1184,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1223,6 +1265,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1315,6 +1358,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop the skipping transaction if enabled */
+ if (skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1465,9 +1512,22 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Start skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2353,6 +2413,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
ErrorContextCallback errcallback;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Push apply error context callback. Other fields will be filled during
* applying the change.
@@ -3768,3 +3839,106 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.committs = 0;
reset_apply_error_context_rel();
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID.
+ *
+ * If origin_lsn and origin_committs are valid, we set origin state to the
+ * transaction commit that resets the skip XID so that we can start streaming
+ * from the transaction next to the one that we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(skipping_changes());
+ Assert(TransactionIdIsValid(skipping_xid));
+ Assert(in_remote_transaction);
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from the correct
+ * position in case of a crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know that
+ * the subscription is no longer stuck on the conflict.
+ *
+ * The message for clearing the error statistics can be lost, but that's
+ * okay. The user can confirm that logical replication is working fine in
+ * other ways, for example, by checking the pg_stat_subscription view, and
+ * can reset the error statistics of a single subscription with the
+ * pg_stat_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 504d65f7d6..aec06b0d23 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3677,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 16eb64b607..d1360bb068 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -545,7 +545,7 @@ typedef struct PgStat_MsgReplSlot
/* ----------
* PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
- * update/reset the error happening during logical
+ * update/reset/clear the error happening during logical
* replication.
* ----------
*/
@@ -563,7 +563,10 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The clear message uses below field */
+
+ /* The clear and reset messages use below fields */
+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */
bool m_reset; /* Reset all fields and set reset_stats
* timestamp */
@@ -1097,6 +1100,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 9518e2c20b..f1934c1064 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -297,6 +297,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index b73d932dfc..a69bb3b36e 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -231,6 +231,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
DROP SUBSCRIPTION regress_testsub;
diff --git a/src/test/subscription/t/023_skip_xact.pl b/src/test/subscription/t/023_skip_xact.pl
new file mode 100644
index 0000000000..8384645759
--- /dev/null
+++ b/src/test/subscription/t/023_skip_xact.pl
@@ -0,0 +1,185 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Test skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 8;
+
+sub test_subscription_error
+{
+ my ($node, $expected, $source, $relname, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# Don't flood the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ 'wal_retrieve_retry_interval = 5s');
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber we
+# create the same tables but with primary keys, and insert some data that will
+# conflict with the data replicated from the publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Start logical replication. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on);");
+
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Also wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violating
+# the unique constraint on test_tab1.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab1 VALUES (1)");
+
+# Insert enough rows to test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber for the same reason.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);");
+
+# Check that the errors on both subscriptions are reported.
+test_subscription_error($node_subscriber, qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'apply', 'test_tab1', 'error reporting by the apply worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'tablesync', 'test_tab2', 'error reporting by the table sync worker');
+test_subscription_error($node_subscriber, qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'apply', 'test_tab_streaming', 'error reporting by the apply worker');
+
+# Set the XIDs of the transactions in question on the subscriptions so that
+# they are skipped.
+my $skip_xid1 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab1'::regclass");
+my $skip_xid2 = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = 'test_tab_streaming'::regclass");
+
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (skip_xid = $skip_xid1)");
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_streaming SET (skip_xid = $skip_xid2)");
+
+# Restart the subscriber so that logical replication restarts without waiting
+# for the retry interval.
+$node_subscriber->restart;
+
+# Wait until the transactions in question are skipped.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription
+WHERE subname in ('tap_sub', 'tap_sub_streaming') AND subskipxid IS NULL
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+# Insert data to test_tab1 that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+
+# Also, insert data to test_tab_streaming that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check that the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped transaction");
+
+# Check that the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
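(For reference, the intended workflow on the subscriber with these patches applied would be roughly the following; the subscription name tap_sub and XID 590 below are illustrative, not part of the patches:)

```sql
-- Identify the failed remote transaction reported by the apply worker.
SELECT subname, relid::regclass, xid, last_failure_message
FROM pg_stat_subscription_errors;

-- Tell the apply worker to skip all changes of that remote transaction.
ALTER SUBSCRIPTION tap_sub SET (skip_xid = 590);

-- After the transaction is skipped, subskipxid is reset to NULL.
SELECT subname, subskipxid FROM pg_subscription;
```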
v6-0002-Add-pg_stat_subscription_errors-statistics-view.patch
From 46ab7d46f308aa6f60470463a10a883c1a98b1a1 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v6 2/4] Add pg_stat_subscription_errors statistics view.
This commit adds a new system view, pg_stat_subscription_errors, showing
errors that happen while applying logical replication changes as well as
during initial table synchronization.

Subscription error entries are removed by autovacuum workers: for table
sync workers, once the table synchronization has completed; for apply
workers, when the subscription is dropped.

It also adds a SQL function, pg_stat_reset_subscription_error(), to reset
the error statistics of a single subscription.
---
doc/src/sgml/monitoring.sgml | 160 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/postmaster/pgstat.c | 651 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 46 +-
src/backend/utils/adt/pgstatfuncs.c | 119 +++++
src/backend/utils/error/elog.c | 16 +
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 106 ++++
src/include/utils/elog.h | 1 +
src/test/regress/expected/rules.out | 22 +
src/tools/pgindent/typedefs.list | 5 +
12 files changed, 1163 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..0fd0bbfa1f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that happened on a subscription, showing
+ information about subscription errors.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is processing when the
+ error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error happened. This
+ field is always NULL if the error is reported by the
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node's transaction being applied
+ when the error happened. This field is always NULL if the error is
+ reported by the <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal>
+ or <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of times the error happened on the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the last failure time.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..449692afa9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 11702f2a80..0507f1961e 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/partition.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_HASH_SIZE 32
/* ----------
@@ -279,6 +282,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -320,6 +324,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubEntry *pgstat_get_subscription_entry(Oid subid,
+ bool create);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -358,6 +368,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1134,6 +1148,166 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubEntry *subent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS hstat_rel;
+ List *not_ready_rels_list;
+ HTAB *not_ready_rels_htab;
+ ListCell *lc;
+ HASHCTL hash_ctl;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = subent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (subent->suberrors == NULL)
+ continue;
+
+ /*
+ * The subscription has error entries. We search for errors of
+ * table sync workers whose relations have already been
+ * synchronized; those errors should be removed.
+ *
+ * Note that the lifetimes of error entries of the apply worker
+ * and the table sync worker are different: the former lives
+ * until the subscription is dropped, whereas the latter lives
+ * until the table synchronization is completed.
+ */
+ not_ready_rels_list = GetSubscriptionNotReadyRelations(subent->subid);
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ /*
+ * The number of not-ready relations can be high for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for a faster lookup.
+ */
+ foreach(lc, not_ready_rels_list)
+ {
+ SubscriptionRelState *r_elem = (SubscriptionRelState *) lfirst(lc);
+ SubscriptionRelState *r_entry;
+
+ CHECK_FOR_INTERRUPTS();
+ r_entry = hash_search(not_ready_rels_htab, (void *) &(r_elem->relid),
+ HASH_ENTER, NULL);
+ memcpy(r_entry, r_elem, sizeof(SubscriptionRelState));
+ }
+
+ list_free(not_ready_rels_list);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+
+ /*
+ * Search for all error entries whose relation is already in
+ * ready state.
+ */
+ hash_seq_init(&hstat_rel, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(errent->subrelid))
+ continue;
+
+ /*
+ * Add the relid to the message if the table synchronization
+ * for this relation has already completed or the table is no
+ * longer subscribed.
+ */
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->subrelid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(not_ready_rels_htab);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1543,6 +1717,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the subscription error.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1863,6 +2056,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about the subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_reset = false;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_failure_time = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2895,6 +3119,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3424,6 +3664,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3725,6 +3978,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionHash)
+ {
+ PgStat_StatSubEntry *subent;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS relhstat;
+ int32 nerrors;
+
+ /* Skip this subscription if it doesn't have any errors */
+ if (subent->suberrors == NULL)
+ continue;
+
+ nerrors = hash_get_num_entries(subent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(subent, sizeof(PgStat_StatSubEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubErrEntry entry that
+ * contains the fixed-length error message string which is
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN in length, bloating the stats
+ * file. That's okay since we assume that the number of
+ * error entries is not high. But if that expectation turns out
+ * to be false, we should write the string and its length instead.
+ */
+ rc = fwrite(errent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4184,6 +4481,100 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubEntry *subent;
+ int32 nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin) !=
+ sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription error entry */
+ subent =
+ (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &(subbuf.subid),
+ HASH_ENTER, NULL);
+ subent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Read error entries */
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubErrEntry errbuf;
+ PgStat_StatSubErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ errent =
+ (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &(errbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(errent, &errbuf, sizeof(PgStat_StatSubErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4526,6 +4917,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4716,6 +5151,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5650,6 +6086,116 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ if (errent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ Assert(errent);
+
+ /* update the error entry */
+ errent->databaseid = msg->m_databaseid;
+ errent->relid = msg->m_relid;
+ errent->command = msg->m_command;
+ errent->xid = msg->m_xid;
+ errent->failure_count++;
+ errent->last_failure = msg->m_failure_time;
+ strlcpy(errent->last_errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (subent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (subent->suberrors != NULL)
+ hash_destroy(subent->suberrors);
+
+ /* Remove the subscription entry */
+ (void) hash_search(subscriptionHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subid, false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription with msg->m_subid is removed and the
+ * corresponding entry is also removed before receiving the error purge
+ * message.
+ */
+ if (subent == NULL)
+ return;
+
+ /* Nothing to do if the subscription doesn't have any error entries */
+ if (subent->suberrors == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ (void) hash_search(subent->suberrors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5747,6 +6293,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_entry
+ *
+ * Return the subscription statistics with the subscription OID. Return NULL
+ * if not found and the caller didn't request to create it.
+ *
+ * 'create' tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubEntry *
+pgstat_get_subscription_entry(Oid subid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ subent = (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize field */
+ if (create && !found)
+ subent->suberrors = NULL;
+
+ return subent;
+}
+
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription
+ * OID and relation OID. Return NULL if not found and the caller didn't
+ * request to create it.
+ *
+ * 'create' tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ PgStat_StatSubErrEntry *errent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ subent = pgstat_get_subscription_entry(subid, create);
+
+ if (subent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ errent = (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ errent->databaseid = InvalidOid;
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = 0;
+ }
+
+ return errent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c2376d755f..5a3ba8d7c1 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -227,8 +227,9 @@ typedef struct ApplyErrCallbackArg
LogicalRepMsgType command; /* 0 if invalid */
/* Local relation information */
- char *nspname;
- char *relname;
+ Oid relid; /* used for error report */
+ char *nspname; /* used for error context */
+ char *relname; /* used for error context */
TransactionId remote_xid;
TimestampTz committs;
@@ -237,6 +238,7 @@ typedef struct ApplyErrCallbackArg
static ApplyErrCallbackArg apply_error_callback_arg =
{
.command = 0,
+ .relid = InvalidOid,
.relname = NULL,
.nspname = NULL,
.remote_xid = InvalidTransactionId,
@@ -3547,8 +3549,23 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ geterrmessage());
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3666,7 +3683,24 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.relid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ geterrmessage());
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
@@ -3711,6 +3745,7 @@ apply_error_callback(void *arg)
static void
set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
{
+ apply_error_callback_arg.relid = rel->localreloid;
apply_error_callback_arg.nspname = rel->remoterel.nspname;
apply_error_callback_arg.relname = rel->remoterel.relname;
}
@@ -3719,6 +3754,7 @@ set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
static void
reset_apply_error_context_rel(void)
{
+ apply_error_callback_arg.relid = InvalidOid;
apply_error_callback_arg.nspname = NULL;
apply_error_callback_arg.relname = NULL;
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f0e09eae4d..b53be1576f 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2240,6 +2241,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset a subscription error stats */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2380,3 +2398,104 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubErrEntry *errent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ errent = pgstat_fetch_subscription_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (errent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(errent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(errent->relid))
+ values[2] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[2] = true;
+
+ if (errent->command == 0)
+ nulls[3] = true;
+ else
+ {
+ /* should be apply worker */
+ Assert(!OidIsValid(errent->subrelid));
+
+ values[3] = CStringGetTextDatum(logicalrep_message_type(errent->command));
+ }
+
+ if (TransactionIdIsValid(errent->xid))
+ values[4] = TransactionIdGetDatum(errent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(errent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ values[6] = Int64GetDatum(errent->failure_count);
+
+ if (errent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(errent->last_failure);
+
+ values[8] = CStringGetTextDatum(errent->last_errmsg);
+
+ if (errent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(errent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..dd36850016 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,6 +1441,22 @@ getinternalerrposition(void)
return edata->internalpos;
}
+/*
+ * geterrmessage --- return the currently set error message
+ *
+ * This is only intended for use in error handling code, such as error
+ * callback subroutines and PG_CATCH blocks, since there is no other place
+ * outside elog.c where the concept is meaningful.
+ */
+const char *
+geterrmessage(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return (const char *) edata->message;
+}
/*
* Functions to allow construction of error message strings separately from
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 8cd0252082..044ff52227 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5321,6 +5321,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5708,6 +5716,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9612c0a6c2..16eb64b607 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -539,6 +543,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
+ * update/reset the error happening during logical
+ * replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* The reset message uses the field below */
+ bool m_reset; /* Reset all fields and set reset_stats
+ * timestamp */
+
+ /* The error report message uses the fields below */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_failure_time;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by autovacuum to purge the subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by autovacuum to purge the subscription
+ * errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -710,6 +776,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -908,6 +977,38 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription statistics kept in the stats collector.
+ */
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubEntry;
+
+/*
+ * Subscription error statistics kept in the stats collector. One entry represents
+ * an error that happened during logical replication, reported by the apply worker
+ * (subrelid is InvalidOid) or by the table sync worker (subrelid is a valid OID).
+ * The error reported by the table sync worker is also removed when the table
+ * synchronization process completes.
+ */
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char last_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -995,6 +1096,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1011,6 +1113,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1106,6 +1211,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid, Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f53607e12e..155145a77d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -222,6 +222,7 @@ extern int err_generic_string(int field, const char *str);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
+extern const char *geterrmessage(void);
/*----------
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..ffad9790ae 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e69b708e33..b294063640 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1938,6 +1938,9 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionErr
+PgStat_MsgSubscriptionErrPurge
+PgStat_MsgSubscriptionPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1949,6 +1952,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubEntry
+PgStat_StatSubErrEntry
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
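As an aside for reviewers: the purge paths in the patch accumulate OIDs into a fixed-size message, flush whenever it fills up, and size each send with offsetof() so that only the used portion of the array goes over the wire. A rough standalone sketch of that pattern follows (the struct and function names here are simplified stand-ins, not the actual pgstat structs):

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_CAPACITY 4            /* stand-in for PGSTAT_NUM_SUBSCRIPTIONERRPURGE */

typedef unsigned int Oid;

typedef struct PurgeMsg
{
    int  n_entries;
    Oid  ids[BATCH_CAPACITY];
} PurgeMsg;

static int sends;                   /* how many messages were "sent" */
static int ids_sent;                /* total ids flushed */

/* Stand-in for pgstat_send(): compute the truncated on-wire length */
static void
send_msg(PurgeMsg *msg)
{
    size_t len = offsetof(PurgeMsg, ids[0]) + msg->n_entries * sizeof(Oid);

    assert(len <= sizeof(PurgeMsg));
    sends++;
    ids_sent += msg->n_entries;
    msg->n_entries = 0;             /* reinitialize to empty */
}

/* Accumulate ids, flushing whenever the message is full */
static void
purge_ids(const Oid *ids, int nids)
{
    PurgeMsg msg;

    msg.n_entries = 0;
    for (int i = 0; i < nids; i++)
    {
        msg.ids[msg.n_entries++] = ids[i];
        if (msg.n_entries >= BATCH_CAPACITY)
            send_msg(&msg);
    }
    /* send the rest of the entries, if any */
    if (msg.n_entries > 0)
        send_msg(&msg);
}
```

With a capacity of 4, purging 10 OIDs produces two full messages plus one partial one, mirroring the flush-when-full and flush-remainder blocks in pgstat_vacuum_stat above.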
On Tue, Aug 10, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the latest patches that incorporated all comments I got
so far. Please review them.
I am not able to apply the latest patch
(v6-0001-Add-errcontext-to-errors-happening-during-applyin) on HEAD,
getting the below error:
patching file src/backend/replication/logical/worker.c
Hunk #11 succeeded at 1195 (offset 50 lines).
Hunk #12 succeeded at 1253 (offset 50 lines).
Hunk #13 succeeded at 1277 (offset 50 lines).
Hunk #14 succeeded at 1305 (offset 50 lines).
Hunk #15 succeeded at 1330 (offset 50 lines).
Hunk #16 succeeded at 1362 (offset 50 lines).
Hunk #17 succeeded at 1508 (offset 50 lines).
Hunk #18 succeeded at 1524 (offset 50 lines).
Hunk #19 succeeded at 1645 (offset 50 lines).
Hunk #20 succeeded at 1671 (offset 50 lines).
Hunk #21 succeeded at 1772 (offset 50 lines).
Hunk #22 succeeded at 1828 (offset 50 lines).
Hunk #23 succeeded at 1934 (offset 50 lines).
Hunk #24 succeeded at 1962 (offset 50 lines).
Hunk #25 succeeded at 2399 (offset 50 lines).
Hunk #26 FAILED at 2405.
Hunk #27 succeeded at 3730 (offset 54 lines).
1 out of 27 hunks FAILED -- saving rejects to file
src/backend/replication/logical/worker.c.rej
--
With Regards,
Amit Kapila.
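As an aside for readers of the log above: patch(1) reports "offset N lines" when a hunk's context still matches but has moved (here, new commits on HEAD shifted worker.c by 50 lines), and FAILED with a .rej file when the context can no longer be found at all. A tiny self-contained illustration of the offset case, using throwaway files rather than the actual patch:

```shell
set -e
dir=$(mktemp -d) && cd "$dir"
printf 'line1\nline2\n' > f.txt              # file the patch was generated against
cp f.txt f.new && printf 'line3\n' >> f.new  # the "patched" version adds a line
diff -u f.txt f.new > demo.patch || true     # diff exits 1 when files differ
printf 'prefix\nline1\nline2\n' > f.txt      # target has since drifted, like HEAD here
patch -p0 < demo.patch                       # hunk still applies, at an offset
cat f.txt                                    # prefix, line1, line2, line3
```

When the drift is large enough that even fuzzy matching fails, the hunk is rejected instead, which is what happened to hunk #26 above.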
On Tue, Aug 10, 2021 at 11:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Aug 10, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the latest patches that incorporated all comments I got
so far. Please review them.
I am not able to apply the latest patch
(v6-0001-Add-errcontext-to-errors-happening-during-applyin) on HEAD,
getting the below error:
Few comments on v6-0001-Add-errcontext-to-errors-happening-during-applyin
==============================================================
1. While applying DML operations, we are setting up the error context
multiple times due to which the context information is not
appropriate. The first is set in apply_dispatch and then during
processing, we set another error callback slot_store_error_callback in
slot_store_data and slot_modify_data. When I forced one of the errors
in slot_store_data(), it displays the below information in CONTEXT
which doesn't make much sense.
2021-08-10 15:16:39.887 IST [6784] ERROR: incorrect binary data
format in logical replication column 1
2021-08-10 15:16:39.887 IST [6784] CONTEXT: processing remote data
for replication target relation "public.test1" column "id"
during apply of "INSERT" for relation "public.test1" in
transaction with xid 740 committs 2021-08-10 14:44:38.058174+05:30
2.
I think we can slightly change the new context information as below:
Before
during apply of "INSERT" for relation "public.test1" in transaction
with xid 740 committs 2021-08-10 14:44:38.058174+05:30
After
during apply of "INSERT" for relation "public.test1" in transaction id
740 with commit timestamp 2021-08-10 14:44:38.058174+05:30
3.
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname;
+ char *relname;
...
...
+
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .relname = NULL,
+ .nspname = NULL,
Let's initialize the struct members in the order they are declared.
The order of relname and nspname should be another way.
4.
+
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
It might be better to add a comment like "remote xact information"
above these structure members.
5.
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
At the end of this call, it is better to free this (pfree(buf.data))
6. In the commit message, you might want to indicate that this
additional information can be used by the future patch to skip the
conflicting transaction.
--
With Regards,
Amit Kapila.
On Tue, Aug 10, 2021 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Aug 10, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the latest patches that incorporated all comments I got
so far. Please review them.
I am not able to apply the latest patch
(v6-0001-Add-errcontext-to-errors-happening-during-applyin) on HEAD,
getting the below error:
patching file src/backend/replication/logical/worker.c
1 out of 27 hunks FAILED -- saving rejects to file
src/backend/replication/logical/worker.c.rej
Sorry, I forgot to rebase the patches onto the current HEAD. Since
stream_prepare has been introduced, I'll add some tests to the patches. I'll
submit the new patches tomorrow, incorporating your comments
on the v6-0001 patch as well.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Aug 10, 2021 at 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the latest patches that incorporated all comments I got
so far. Please review them.
Some initial review comments on the v6-0001 patch:
src/backend/replication/logical/proto.c:
(1)
+ TimestampTz committs;
I think it looks better to name "committs" as "commit_ts"; it is also
more consistent with the naming of the other member, "remote_xid".
src/backend/replication/logical/worker.c:
(2)
To be consistent with all other function headers, should start
sentence with capital: "get" -> "Get"
+ * get string representing LogicalRepMsgType.
(3) It looks a bit cumbersome and repetitive to set/update the members
of apply_error_callback_arg in numerous places.
I suggest making the "set_apply_error_context..." and
"reset_apply_error_context..." functions as "static inline void"
functions (moving them to the top part of the source file, and
removing the existing function declarations for these).
Also, can add something similar to below:
static inline void
set_apply_error_callback_xid(TransactionId xid)
{
apply_error_callback_arg.remote_xid = xid;
}
static inline void
set_apply_error_callback_xid_info(TransactionId xid, TimestampTz commit_ts)
{
apply_error_callback_arg.remote_xid = xid;
apply_error_callback_arg.commit_ts = commit_ts;
}
so that instances of, for example:
apply_error_callback_arg.remote_xid = prepare_data.xid;
apply_error_callback_arg.committs = prepare_data.commit_time;
can be:
set_apply_error_callback_xid_info(prepare_data.xid, prepare_data.commit_time);
(4) The apply_error_callback() function is missing a function header/comment.
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Aug 10, 2021 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Aug 10, 2021 at 11:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Aug 10, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the latest patches that incorporated all comments I got
so far. Please review them.
I am not able to apply the latest patch
(v6-0001-Add-errcontext-to-errors-happening-during-applyin) on HEAD,
getting the below error:
Few comments on v6-0001-Add-errcontext-to-errors-happening-during-applyin
Thank you for the comments!
==============================================================
1. While applying DML operations, we are setting up the error context
multiple times due to which the context information is not
appropriate. The first is set in apply_dispatch and then during
processing, we set another error callback slot_store_error_callback in
slot_store_data and slot_modify_data. When I forced one of the errors
in slot_store_data(), it displays the below information in CONTEXT
which doesn't make much sense.
2021-08-10 15:16:39.887 IST [6784] ERROR: incorrect binary data
format in logical replication column 1
2021-08-10 15:16:39.887 IST [6784] CONTEXT: processing remote data
for replication target relation "public.test1" column "id"
during apply of "INSERT" for relation "public.test1" in
transaction with xid 740 committs 2021-08-10 14:44:38.058174+05:30
Yes, but we cannot change the error context message depending on other
error context messages. So it seems hard to construct a complete
sentence in the context message that is okay in terms of English
grammar. Is the following message better?
CONTEXT: processing remote data for replication target relation
"public.test1" column "id"
applying "INSERT" for relation "public.test1" in transaction
with xid 740 committs 2021-08-10 14:44:38.058174+05:30
2.
I think we can slightly change the new context information as below:
Before
during apply of "INSERT" for relation "public.test1" in transaction
with xid 740 committs 2021-08-10 14:44:38.058174+05:30
After
during apply of "INSERT" for relation "public.test1" in transaction id
740 with commit timestamp 2021-08-10 14:44:38.058174+05:30
Fixed.
3.
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname;
+ char *relname;
...
...
+
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .relname = NULL,
+ .nspname = NULL,
Let's initialize the struct members in the order they are declared.
The order of relname and nspname should be another way.
Fixed.
4.
+
+ TransactionId remote_xid;
+ TimestampTz committs;
+} ApplyErrCallbackArg;
It might be better to add a comment like "remote xact information"
above these structure members.
Fixed.
5.
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
At the end of this call, it is better to free this (pfree(buf.data))
Fixed.
6. In the commit message, you might want to indicate that this
additional information can be used by the future patch to skip the
conflicting transaction.
Fixed.
I've attached the new patches. Please review them.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
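To put the pieces of this patch set together, the intended end-to-end workflow on the subscriber would look roughly as follows (the pg_stat_subscription_errors view and the skip_xid option are what these patches propose; they do not exist in released PostgreSQL):

```sql
-- 1. Identify the failing transaction from the proposed statistics view.
SELECT subname, command, xid, last_failure_message
FROM pg_stat_subscription_errors;

-- 2. Ask the apply worker to skip exactly that transaction (xid 716 in
--    the documentation example above).
ALTER SUBSCRIPTION test_sub SET (skip_xid = 716);

-- 3. After the whole transaction has been skipped, the worker clears
--    pg_subscription.subskipxid and the subscription's error statistics.
SELECT subname, subskipxid FROM pg_subscription;
```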
Attachments:
v7-0004-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch
From 4e628e580a0b3835bd6047aacc2610761cb11b26 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v7 4/4] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates a constraint, logical replication stops
until the problem is resolved. This commit introduces another way to skip the
transaction in question.
The user can specify the XID with ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), which updates the pg_subscription.subskipxid field and tells the
apply worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of
the subscription in the pg_stat_subscription_errors system view so
that the user does not get confused. This is done by sending a message
for clearing a subscription error to the stats collector.
---
doc/src/sgml/logical-replication.sgml | 49 ++++-
doc/src/sgml/ref/alter_subscription.sgml | 32 ++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 45 +++-
src/backend/postmaster/pgstat.c | 44 +++-
src/backend/replication/logical/worker.c | 201 ++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/include/pgstat.h | 8 +-
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 +
src/test/subscription/t/024_skip_xact.pl | 244 +++++++++++++++++++++
12 files changed, 637 insertions(+), 25 deletions(-)
create mode 100644 src/test/subscription/t/024_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..1e3c8c40f5 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in the <structname>pg_stat_subscription_errors</structname>
+ view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: during apply of "INSERT" for relation "public.test" in transaction with xid 716 committs 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> to the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Either way, these should be used as a last resort. They skip the whole
+ transaction, including changes that may not violate any constraint, and can
+ easily make the subscriber inconsistent if the user specifies the wrong
+ transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 8c3c28b7e7..cfb318e08c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -205,8 +205,36 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are: <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ <literal>streaming</literal>, and following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ will stop until the problem is resolved. The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ the incoming change or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose changes the logical
+ replication worker is to skip. The worker skips all data modification
+ changes within the specified transaction. Since it skips
+ the whole transaction, including changes that may not violate the
+ constraint, it should only be used as a last resort. This option has
+ no effect on a transaction that is already prepared with
+ <literal>two_phase</literal> enabled on the subscriber. After the logical
+ replication worker successfully skips the transaction, the transaction ID
+ (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index cc390ce95a..188f3e42fd 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -885,7 +913,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, false);
@@ -934,6 +962,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
@@ -941,7 +976,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, true);
@@ -967,6 +1002,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 498ea63863..90172fc99d 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1745,11 +1745,32 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
msg.m_subid = subid;
msg.m_subrelid = subrelid;
+ msg.m_clear = false;
msg.m_reset = true;
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
}
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error of the subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_clear = true;
+ msg.m_reset = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -2092,6 +2113,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_subrelid = subrelid;
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
+ msg.m_clear = false;
msg.m_reset = false;
msg.m_command = command;
msg.m_xid = xid;
@@ -6224,27 +6246,37 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
PgStat_StatSubErrEntry *errent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
errent = pgstat_get_subscription_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
+ Assert(!(msg->m_reset && msg->m_clear));
+
if (errent == NULL)
return;
- /* reset fields and set reset timestamp */
errent->relid = InvalidOid;
errent->command = 0;
errent->xid = InvalidTransactionId;
errent->failure_count = 0;
- errent->last_failure = 0;
- errent->last_errmsg[0] = '\0';
- errent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 31845fdba3..240fef0a86 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -279,6 +280,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID if we're skipping all data modification changes
+ * (INSERT/UPDATE/DELETE/TRUNCATE) of the transaction specified by
+ * MySubscription->skipxid. Note that we don't skip receiving the changes,
+ * particularly in streaming cases, since we decide whether or not to skip
+ * applying the changes when starting to apply. Once we start skipping changes,
+ * we copy the XID to skipping_xid and don't stop skipping until we have
+ * skipped the whole transaction, even if the subscription is invalidated and
+ * MySubscription->skipxid gets changed or reset. When we stop skipping, we
+ * reset the skip XID (subskipxid) in the pg_subscription catalog and associate
+ * the origin status with the transaction that resets the skip XID so that we
+ * can start streaming from the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -367,6 +383,9 @@ static inline void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static inline void reset_apply_error_context_rel(void);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -863,6 +882,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -887,7 +911,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop skipping changes if enabled. Otherwise, commit the changes
+ * that have just been applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -915,6 +950,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -973,9 +1011,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -989,6 +1028,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1120,6 +1163,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1130,6 +1176,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1155,9 +1205,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1179,6 +1230,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1191,9 +1245,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1275,6 +1326,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1368,6 +1420,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1517,9 +1573,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2406,6 +2476,17 @@ apply_dispatch(StringInfo s)
ErrorContextCallback errcallback;
bool set_callback = false;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Push apply error context callback if not yet. Other fields will be
* filled during applying the change. Since this function can be called
@@ -3842,3 +3923,103 @@ reset_apply_error_context_info(void)
set_apply_error_context_xact(InvalidTransactionId, 0);
reset_apply_error_context_rel();
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID
+ * (pg_subscription.subskipxid). If origin_lsn and origin_committs are valid, we
+ * set origin state to the transaction commit that resets the skip XID so that we
+ * can start streaming from the transaction next to the one that we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know that
+ * the subscription is no longer stuck on the conflict.
+ *
+ * The message for clearing the error statistics can be lost, but that's
+ * okay. Users can confirm that logical replication is working in other
+ * ways, for example by checking the pg_stat_subscription view, and can
+ * reset the error statistics of a single subscription with the
+ * pg_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 504d65f7d6..aec06b0d23 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3677,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index f66eb290bb..a611ca7cce 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -556,7 +556,7 @@ typedef struct PgStat_MsgReplSlot
/* ----------
* PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
- * update/reset the error happening during logical
+ * update/reset/clear the error happening during logical
* replication.
* ----------
*/
@@ -574,7 +574,10 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The clear message uses below field */
+
+ /* The clear and reset messages use below fields */
+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */
bool m_reset; /* Reset all fields and set reset_stats
* timestamp */
@@ -1133,6 +1136,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index b87f67fe55..217b5fabd1 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -296,6 +296,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index aa90560691..4c9d25f0a4 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -231,6 +231,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/024_skip_xact.pl b/src/test/subscription/t/024_skip_xact.pl
new file mode 100644
index 0000000000..affb663803
--- /dev/null
+++ b/src/test/subscription/t/024_skip_xact.pl
@@ -0,0 +1,244 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# Test whether the error reported in the pg_stat_subscription_errors view is as expected.
+sub test_subscription_error
+{
+ my ($node, $source, $relname, $expected_error, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Check the error reported in the pg_stat_subscription_errors view and skip
+# the failed transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $source, $subname, $relname, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $source, $relname, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber to resume logical replication without waiting for wal_retrieve_retry_interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+ # Also wait for the error details to be cleared.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT command IS NULL FROM pg_stat_subscription_errors
+WHERE subname = '$subname' AND failure_source = '$source';
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# don't overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violating
+# the unique constraint on test_tab1. Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber,
+ 'tablesync', 'test_tab2',
+ qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'check the error reported by the table sync worker');
+
+# Insert enough rows to test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber during applying spooled changes for the same reason. Then
+# skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error reported by the apply worker while applying streaming changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping transactions that are prepared and stream-prepared. We insert
+# the same data as in the previous tests but prepare the transactions. Those insertions
+# raise an error on the subscriber. Then we skip the transactions in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error on changes of the prepared transaction');
+
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
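
For reference, the intended user workflow with this patch set would look
like the following (a sketch; XID 590 is the illustrative value from the
errcontext example at the top of this mail):

    -- The apply worker fails and the server log shows the remote XID (590).
    -- Tell the apply worker to skip that remote transaction:
    ALTER SUBSCRIPTION test_sub SET (skip_xid = 590);

    -- Once the transaction has been skipped, subskipxid goes back to NULL:
    SELECT subname, subskipxid FROM pg_subscription;

    -- The skip XID can also be cleared manually before it takes effect:
    ALTER SUBSCRIPTION test_sub RESET (skip_xid);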
Attachment: v7-0001-Add-errcontext-to-errors-happening-during-applyin.patch
From 8085e23327bc108ebd3ae5668a0c28946c4b4e84 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v7 1/4] Add errcontext to errors happening during applying
logical replication changes.
This commit adds error context to errors happening while applying
logical replication changes, showing the command, the relation, the
transaction ID, and the commit timestamp in the server log.
This additional information is also used by a follow-up commit that
makes it possible to skip a particular transaction on the subscriber.
---
src/backend/replication/logical/proto.c | 51 ++++++
src/backend/replication/logical/worker.c | 203 ++++++++++++++++++++---
src/include/replication/logicalproto.h | 1 +
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 233 insertions(+), 23 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 52b65e9572..bb5016aa17 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1156,3 +1156,54 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Get string representing LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_PREPARE:
+ return "STREAM PREPARE";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ecaed157f2..e22b8a3903 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -221,6 +221,29 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+
+ /* Local relation information */
+ char *nspname;
+ char *relname;
+
+ /* Remote transaction information */
+ TransactionId remote_xid;
+ TimestampTz commit_ts;
+} ApplyErrCallbackArg;
+
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .nspname = NULL,
+ .relname = NULL,
+ .remote_xid = InvalidTransactionId,
+ .commit_ts = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -335,6 +358,13 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for apply error callback */
+static void apply_error_callback(void *arg);
+static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts);
+static inline void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
+static inline void reset_apply_error_context_rel(void);
+static inline void reset_apply_error_context_info(void);
+
/*
* Should this worker apply changes for given relation.
*
@@ -827,6 +857,7 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.committime);
remote_final_lsn = begin_data.final_lsn;
@@ -860,6 +891,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -877,6 +909,7 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.prepare_time);
remote_final_lsn = begin_data.prepare_lsn;
@@ -962,6 +995,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -974,6 +1008,7 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -1001,6 +1036,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1013,6 +1049,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1050,6 +1087,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1076,6 +1114,7 @@ apply_handle_stream_prepare(StringInfo s)
errmsg_internal("tablesync worker received a STREAM PREPARE message")));
logicalrep_read_stream_prepare(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.prepare_time);
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
@@ -1100,6 +1139,8 @@ apply_handle_stream_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1156,6 +1197,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ set_apply_error_context_xact(stream_xid, 0);
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1212,6 +1255,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1235,7 +1279,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ set_apply_error_context_xact(xid, 0);
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1260,6 +1307,8 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ set_apply_error_context_xact(subxid, 0);
+
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1284,6 +1333,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1315,6 +1365,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1459,6 +1511,7 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ set_apply_error_context_xact(xid, commit_data.committime);
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1473,6 +1526,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1592,6 +1647,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1615,6 +1673,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1713,6 +1774,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1766,6 +1830,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1869,6 +1936,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ set_apply_error_context_rel(rel);
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1894,6 +1964,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ reset_apply_error_context_rel();
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2328,44 +2401,62 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
+ bool set_callback = false;
+
+ /*
+ * Push apply error context callback if not yet. Other fields will be
+ * filled during applying the change. Since this function can be called
+ * recursively when applying spooled changes, we set the callback only
+ * once.
+ */
+ if (apply_error_callback_arg.command == 0)
+ {
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
+ set_callback = true;
+ }
+
+ apply_error_callback_arg.command = action;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2374,49 +2465,53 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_END:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_PREPARE:
apply_handle_stream_prepare(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Pop the error context stack */
+ if (set_callback)
+ error_context_stack = errcallback.previous;
}
/*
@@ -3649,3 +3744,65 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+/* Error callback to give more context info about the change being applied */
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("applying \"%s\""),
+ logicalrep_message_type(apply_error_callback_arg.command));
+
+ if (apply_error_callback_arg.relname)
+ appendStringInfo(&buf, _(" for relation \"%s.%s\""),
+ apply_error_callback_arg.nspname,
+ apply_error_callback_arg.relname);
+
+ if (TransactionIdIsNormal(apply_error_callback_arg.remote_xid))
+ appendStringInfo(&buf, _(" in transaction id %u with commit timestamp %s"),
+ apply_error_callback_arg.remote_xid,
+ apply_error_callback_arg.commit_ts == 0
+ ? "(unset)"
+ : timestamptz_to_str(apply_error_callback_arg.commit_ts));
+
+ errcontext("%s", buf.data);
+ pfree(buf.data);
+}
+
+/* Set transaction information of apply error callback */
+static inline void
+set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts)
+{
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.commit_ts = commit_ts;
+}
+
+/* Set relation information of apply error callback */
+static inline void
+set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
+{
+ apply_error_callback_arg.nspname = rel->remoterel.nspname;
+ apply_error_callback_arg.relname = rel->remoterel.relname;
+}
+
+/* Reset relation information of apply error callback */
+static inline void
+reset_apply_error_context_rel(void)
+{
+ apply_error_callback_arg.nspname = NULL;
+ apply_error_callback_arg.relname = NULL;
+}
+
+/* Reset all information of apply error callback */
+static inline void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ set_apply_error_context_xact(InvalidTransactionId, 0);
+ reset_apply_error_context_rel();
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 2e29513151..af89f58fd3 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -246,5 +246,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 37cf4b2f76..e69b708e33 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -113,6 +113,7 @@ Append
AppendPath
AppendRelInfo
AppendState
+ApplyErrCallbackArg
ApplyExecutionData
ApplySubXactData
Archive
--
2.24.3 (Apple Git-128)
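
With this patch applied, an error during apply is reported with additional
context, for example (values are illustrative; the CONTEXT format follows
apply_error_callback in the patch above):

    ERROR:  duplicate key value violates unique constraint "test_pkey"
    DETAIL:  Key (c)=(1) already exists.
    CONTEXT:  applying "INSERT" for relation "public.test" in transaction id 590 with commit timestamp 2021-05-21 14:32:02.134273+09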
Attachment: v7-0003-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch
From 168764a0b756a07d3ed9cd84f022f2233e4a7dfc Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v7 3/4] Add RESET command to ALTER SUBSCRIPTION command.
The ALTER SUBSCRIPTION ... RESET command resets subscription
parameters to their defaults. The parameters that can be reset are
streaming, binary, and synchronous_commit.
The RESET command is required by a follow-up commit that introduces a
new parameter, skip_xid, which needs to be reset.
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 78 +++++++++++++++++-----
src/backend/parser/gram.y | 11 ++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 +++-
src/test/regress/sql/subscription.sql | 13 ++++
6 files changed, 109 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..8c3c28b7e7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,16 +193,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are <literal>streaming</literal>,
+ <literal>binary</literal>, and <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5157f44058..cc390ce95a 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -923,10 +938,43 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +1009,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1011,7 +1059,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1059,7 +1107,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..bcf85e8980 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,7 +9707,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e28248af32..504d65f7d6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,7 +3659,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3671,7 +3672,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 77b4437b69..b87f67fe55 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -284,11 +284,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index d42104c191..aa90560691 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -218,6 +218,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
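The option-handling rule that patch 0003 adds to parse_subscription_options can be modeled compactly: a RESET list may name only resettable parameters, must not carry values, and reverts each named parameter to its CREATE SUBSCRIPTION default. A Python sketch (assumed defaults; not the actual C implementation):

```python
# Model of the RESET semantics added by the patch above.
RESETTABLE = {"streaming", "binary", "synchronous_commit"}
DEFAULTS = {"streaming": False, "binary": False, "synchronous_commit": "off"}


def reset_subscription_options(current, opts):
    """current: dict of parameter values; opts: list of (name, value) pairs
    as parsed from RESET ( ... ); value is None unless the user wrote '='."""
    for name, value in opts:
        if value is not None:
            # Mirrors: ERROR: RESET must not include values for parameters
            raise ValueError("RESET must not include values for parameters")
        if name not in RESETTABLE:
            # Mirrors: ERROR: unrecognized subscription parameter: "..."
            raise ValueError('unrecognized subscription parameter: "%s"' % name)
        current[name] = DEFAULTS[name]
    return current
```

This matches the regression tests above: RESET (connect) and RESET (synchronous_commit = off) both fail, while RESET (synchronous_commit, binary, streaming) reverts all three parameters.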
v7-0002-Add-pg_stat_subscription_errors-statistics-view.patch
From 33b4f68209b8d9f98b60ba5af3c5d59e730cb388 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v7 2/4] Add pg_stat_subscription_errors statistics view.
This commit adds a new system view, pg_stat_subscription_errors,
showing errors that happen while applying logical replication changes
as well as during the initial table synchronization.
Subscription error entries are removed by autovacuum workers: for
table sync workers, when the table synchronization has completed; for
apply workers, when the subscription is dropped.
It also adds the SQL function pg_stat_reset_subscription_error() to
reset a single subscription error entry.
---
doc/src/sgml/monitoring.sgml | 160 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/postmaster/pgstat.c | 651 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 46 +-
src/backend/utils/adt/pgstatfuncs.c | 119 +++++
src/backend/utils/error/elog.c | 16 +
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 106 ++++
src/include/utils/elog.h | 1 +
src/test/regress/expected/rules.out | 22 +
src/tools/pgindent/typedefs.list | 5 +
12 files changed, 1163 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..0fd0bbfa1f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that occurred on a subscription, showing information
+ about subscription errors.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is processing when the
+ error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error happened. This
+ field is always NULL if the error is reported by a
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID on the publisher node whose changes were being
+ applied when the error happened. This field is always NULL if the
+ error is reported by a <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times the error has happened on the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the time of the last failure.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..449692afa9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 1b54ef74eb..498ea63863 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/partition.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_HASH_SIZE 32
/* ----------
@@ -280,6 +283,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -330,6 +334,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubEntry *pgstat_get_subscription_entry(Oid subid,
+ bool create);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -370,6 +380,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1148,6 +1162,166 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubEntry *subent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS hstat_rel;
+ List *not_ready_rels_list;
+ HTAB *not_ready_rels_htab;
+ ListCell *lc;
+ HASHCTL hash_ctl;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = subent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (subent->suberrors == NULL)
+ continue;
+
+ /*
+ * The subscription has error entries. We search errors of the
+ * table sync workers who are already in sync state. Those errors
+ * should be removed.
+ *
+ * Note that the lifetime of error entries of the apply worker and
+ * the table sync worker are different. The former lives until
+ * the subscription is dropped whereas the latter lives the table
+ * synchronization is completed.
+ */
+ not_ready_rels_list = GetSubscriptionNotReadyRelations(subent->subid);
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ /*
+ * The number of not-ready relations can be high for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for a faster lookup.
+ */
+ foreach(lc, not_ready_rels_list)
+ {
+ SubscriptionRelState *r_elem = (SubscriptionRelState *) lfirst(lc);
+ SubscriptionRelState *r_entry;
+
+ CHECK_FOR_INTERRUPTS();
+ r_entry = hash_search(not_ready_rels_htab, (void *) &(r_elem->relid),
+ HASH_ENTER, NULL);
+ memcpy(r_entry, r_elem, sizeof(SubscriptionRelState));
+ }
+
+ list_free(not_ready_rels_list);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+
+ /*
+ * Search for all error entries whose relation is already in
+ * ready state
+ */
+ hash_seq_init(&hstat_rel, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(errent->subrelid))
+ continue;
+
+ /*
+ * Add the relid to the message if the table synchronization
+ * for this relation is already complete or the table is no
+ * longer subscribed.
+ */
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->subrelid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(not_ready_rels_htab);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1557,6 +1731,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the subscription error entry.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1877,6 +2070,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about the subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_reset = false;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_failure_time = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2944,6 +3168,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3525,6 +3765,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3826,6 +4079,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionHash)
+ {
+ PgStat_StatSubEntry *subent;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS relhstat;
+ int32 nerrors;
+
+ /* Skip this subscription if it doesn't have any errors */
+ if (subent->suberrors == NULL)
+ continue;
+
+ nerrors = hash_get_num_entries(subent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(subent, sizeof(PgStat_StatSubEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubErrEntry entry, which
+ * contains a fixed-length error message string of
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes, bloating the stats
+ * file. That's okay as long as the number of error entries
+ * stays low, but if that expectation turns out to be false,
+ * we should write the string and its length instead.
+ */
+ rc = fwrite(errent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4287,6 +4584,100 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubEntry *subent;
+ int32 nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin) !=
+ sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription error entry */
+ subent =
+ (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &(subbuf.subid),
+ HASH_ENTER, NULL);
+ subent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Read error entries */
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubErrEntry errbuf;
+ PgStat_StatSubErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ errent =
+ (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &(errbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(errent, &errbuf, sizeof(PgStat_StatSubErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4629,6 +5020,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4834,6 +5269,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5779,6 +6215,116 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ if (errent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ Assert(errent);
+
+ /* update the error entry */
+ errent->databaseid = msg->m_databaseid;
+ errent->relid = msg->m_relid;
+ errent->command = msg->m_command;
+ errent->xid = msg->m_xid;
+ errent->failure_count++;
+ errent->last_failure = msg->m_failure_time;
+ strlcpy(errent->last_errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping the subscription entry arrived before the message
+ * reporting the error.
+ */
+ if (subent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (subent->suberrors != NULL)
+ hash_destroy(subent->suberrors);
+
+ /* Remove the subscription entry */
+ (void) hash_search(subscriptionHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subid, false);
+
+ /*
+ * Nothing to do if the subscription entry or its error hash table is
+ * not found. This could happen when the subscription with
+ * msg->m_subid is removed and the corresponding entry is also removed
+ * before receiving the error purge message.
+ */
+ if (subent == NULL || subent->suberrors == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ (void) hash_search(subent->suberrors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5876,6 +6422,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_entry
+ *
+ * Return the subscription statistics entry for the given subscription OID.
+ * Return NULL if it is not found and the caller didn't request to create it.
+ *
+ * 'create' tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubEntry *
+pgstat_get_subscription_entry(Oid subid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ subent = (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize field */
+ if (create && !found)
+ subent->suberrors = NULL;
+
+ return subent;
+}
+
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription OID and
+ * relation OID. Return NULL if it is not found and the caller didn't
+ * request to create it.
+ *
+ * 'create' tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ PgStat_StatSubErrEntry *errent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ subent = pgstat_get_subscription_entry(subid, create);
+
+ if (subent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ errent = (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ errent->databaseid = InvalidOid;
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = 0;
+ }
+
+ return errent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index e22b8a3903..31845fdba3 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -227,8 +227,9 @@ typedef struct ApplyErrCallbackArg
LogicalRepMsgType command; /* 0 if invalid */
/* Local relation information */
- char *nspname;
- char *relname;
+ Oid relid; /* used for error report */
+ char *nspname; /* used for error context */
+ char *relname; /* used for error context */
/* Remote transaction information */
TransactionId remote_xid;
@@ -238,6 +239,7 @@ typedef struct ApplyErrCallbackArg
static ApplyErrCallbackArg apply_error_callback_arg =
{
.command = 0,
+ .relid = InvalidOid,
.nspname = NULL,
.relname = NULL,
.remote_xid = InvalidTransactionId,
@@ -3612,8 +3614,23 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ geterrmessage());
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3731,7 +3748,24 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.relid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ geterrmessage());
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
@@ -3786,6 +3820,7 @@ set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts)
static inline void
set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
{
+ apply_error_callback_arg.relid = rel->localreloid;
apply_error_callback_arg.nspname = rel->remoterel.nspname;
apply_error_callback_arg.relname = rel->remoterel.relname;
}
@@ -3794,6 +3829,7 @@ set_apply_error_context_rel(LogicalRepRelMapEntry *rel)
static inline void
reset_apply_error_context_rel(void)
{
+ apply_error_callback_arg.relid = InvalidOid;
apply_error_callback_arg.nspname = NULL;
apply_error_callback_arg.relname = NULL;
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..4048c99a9e 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset subscription error stats */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,104 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubErrEntry *errent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ errent = pgstat_fetch_subscription_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (errent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(errent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(errent->relid))
+ values[2] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[2] = true;
+
+ if (errent->command == 0)
+ nulls[3] = true;
+ else
+ {
+ /* should be apply worker */
+ Assert(!OidIsValid(errent->subrelid));
+
+ values[3] = CStringGetTextDatum(logicalrep_message_type(errent->command));
+ }
+
+ if (TransactionIdIsValid(errent->xid))
+ values[4] = TransactionIdGetDatum(errent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(errent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ values[6] = Int64GetDatum(errent->failure_count);
+
+ if (errent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(errent->last_failure);
+
+ values[8] = CStringGetTextDatum(errent->last_errmsg);
+
+ if (errent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(errent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..dd36850016 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,6 +1441,22 @@ getinternalerrposition(void)
return edata->internalpos;
}
+/*
+ * geterrmessage --- return the currently set error message
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+const char *
+geterrmessage(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return (const char *) edata->message;
+}
/*
* Functions to allow construction of error message strings separately from
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b603700ed9..7f9c27bdda 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 2068a68a5f..f66eb290bb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -550,6 +554,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
+ * update/reset the error happening during logical
+ * replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* The clear message uses below field */
+ bool m_reset; /* Reset all fields and set reset_stats
+ * timestamp */
+
+ /* The error report message uses below fields */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_failure_time;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by autovacuum to purge the subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by autovacuum to purge the subscription
+ * errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -722,6 +788,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -938,6 +1007,38 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription statistics kept in the stats collector.
+ */
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubEntry;
+
+/*
+ * Subscription error statistics kept in the stats collector. One entry
+ * represents an error that happened during logical replication, reported by
+ * the apply worker (subrelid is InvalidOid) or by a table sync worker
+ * (subrelid is a valid OID). An error reported by a table sync worker is
+ * also removed when the table synchronization completes.
+ */
+
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char last_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1031,6 +1132,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1047,6 +1149,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1145,6 +1250,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid, Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f53607e12e..155145a77d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -222,6 +222,7 @@ extern int err_generic_string(int field, const char *str);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
+extern const char *geterrmessage(void);
/*----------
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..ffad9790ae 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e69b708e33..b294063640 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1938,6 +1938,9 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionErr
+PgStat_MsgSubscriptionErrPurge
+PgStat_MsgSubscriptionPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1949,6 +1952,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubEntry
+PgStat_StatSubErrEntry
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Tue, Aug 10, 2021 at 10:27 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Aug 10, 2021 at 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the latest patches that incorporated all comments I got
so far. Please review them.

Some initial review comments on the v6-0001 patch:
Thanks for reviewing the patch!
src/backend/replication/logical/proto.c:
(1)+ TimestampTz committs;
I think it looks better to name "committs" as "commit_ts", and also is
more consistent with naming for other member "remote_xid".
Fixed.
src/backend/replication/logical/worker.c:
(2)
To be consistent with all other function headers, should start
sentence with capital: "get" -> "Get"

+ * get string representing LogicalRepMsgType.
Fixed
(3) It looks a bit cumbersome and repetitive to set/update the members
of apply_error_callback_arg in numerous places.

I suggest making the "set_apply_error_context..." and
"reset_apply_error_context..." functions as "static inline void"
functions (moving them to the top part of the source file, and
removing the existing function declarations for these).

Also, can add something similar to below:
static inline void
set_apply_error_callback_xid(TransactionId xid)
{
apply_error_callback_arg.remote_xid = xid;
}

static inline void
set_apply_error_callback_xid_info(TransactionId xid, TimestampTz commit_ts)
{
apply_error_callback_arg.remote_xid = xid;
apply_error_callback_arg.commit_ts = commit_ts;
}so that instances of, for example:
apply_error_callback_arg.remote_xid = prepare_data.xid;
apply_error_callback_arg.committs = prepare_data.commit_time;

can be:
set_apply_error_callback_tx_info(prepare_data.xid, prepare_data.commit_time);
Okay. I've added set_apply_error_callback_xact() function to set
transaction information to apply error callback. Also, I inlined those
helper functions since we call them every change.
(4) The apply_error_callback() function is missing a function header/comment.
Added.
The fixes for the above comments are incorporated in the v7 patch I
just submitted[1].
Regards,
[1]: /messages/by-id/CAD21AoALAq_0q_Zz2K0tO=kuUj8aBrDdMJXbey1P6t4w8snpQQ@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Aug 11, 2021 at 11:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 10, 2021 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
==============================================================
1. While applying DML operations, we are setting up the error context
multiple times due to which the context information is not
appropriate. The first is set in apply_dispatch and then during
processing, we set another error callback slot_store_error_callback in
slot_store_data and slot_modify_data. When I forced one of the errors
in slot_store_data(), it displays the below information in CONTEXT
which doesn't make much sense.

2021-08-10 15:16:39.887 IST [6784] ERROR: incorrect binary data
format in logical replication column 1
2021-08-10 15:16:39.887 IST [6784] CONTEXT: processing remote data
for replication target relation "public.test1" column "id"
during apply of "INSERT" for relation "public.test1" in
transaction with xid 740 committs 2021-08-10 14:44:38.058174+05:30

Yes, but we cannot change the error context message depending on other
error context messages. So it seems hard to construct a complete
sentence in the context message that is okay in terms of English
grammar. Is the following message better?

CONTEXT: processing remote data
"public.test1" column “id"
applying "INSERT" for relation "public.test1” in transaction
with xid 740 committs 2021-08-10 14:44:38.058174+05:30
I don't like the proposed text. How about if we combine both and have
something like: "processing remote data during "UPDATE" for
replication target relation "public.test1" column "id" in transaction
id 740 with commit timestamp 2021-08-10 14:44:38.058174+05:30"? For
this, I think we need to remove slot_store_error_callback and
add/change the ApplyErrCallbackArg to include the additional required
information in that callback.
--
With Regards,
Amit Kapila.
On Wed, Aug 11, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the new patches. Please review them.
Please note that newly added tap tests fail due to known assertion
failure in pgstats that I reported here[1].
Regards,
[1]: /messages/by-id/CAD21AoCCAa+J1-udHRo5-Hbtv=D38WdZDAaXZGDbQQ_Vg_d3bQ@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Aug 11, 2021 at 5:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 11, 2021 at 11:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 10, 2021 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
==============================================================
1. While applying DML operations, we are setting up the error context
multiple times due to which the context information is not
appropriate. The first is set in apply_dispatch and then during
processing, we set another error callback slot_store_error_callback in
slot_store_data and slot_modify_data. When I forced one of the errors
in slot_store_data(), it displays the below information in CONTEXT
which doesn't make much sense.

2021-08-10 15:16:39.887 IST [6784] ERROR: incorrect binary data
format in logical replication column 1
2021-08-10 15:16:39.887 IST [6784] CONTEXT: processing remote data
for replication target relation "public.test1" column "id"
during apply of "INSERT" for relation "public.test1" in
transaction with xid 740 committs 2021-08-10 14:44:38.058174+05:30

Yes, but we cannot change the error context message depending on other
error context messages. So it seems hard to construct a complete
sentence in the context message that is okay in terms of English
grammar. Is the following message better?

CONTEXT: processing remote data for replication target relation
"public.test1" column "id"
applying "INSERT" for relation "public.test1" in transaction
with xid 740 committs 2021-08-10 14:44:38.058174+05:30

I don't like the proposed text. How about if we combine both and have
something like: "processing remote data during "UPDATE" for
replication target relation "public.test1" column "id" in transaction
id 740 with commit timestamp 2021-08-10 14:44:38.058174+05:30"? For
this, I think we need to remove slot_store_error_callback and
add/change the ApplyErrCallbackArg to include the additional required
information in that callback.
Oh, I've never thought about that. That's a good idea.
I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset
cleanup to make cfbot tests happy.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v8-0005-Move-shared-fileset-cleanup-to-before_shmem_exit.patch (application/octet-stream)
From d738bbe63494ec62b691a95fff2e4c8f8318f02b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 12 Aug 2021 10:57:41 +0900
Subject: [PATCH v8 5/5] Move shared fileset cleanup to before_shmem_exit().
The reported problem is that a shared fileset created in
SharedFileSetInit() by a logical replication apply worker is cleaned up
in SharedFileSetDeleteOnProcExit() when the process exits on an error
due to a conflict. Since shared fileset cleanup triggers pgstat
reporting for the underlying temporary files, the assertions added in
ee3f8d3d3ae caused failures.
To fix the problem, similar to 675c945394, move shared fileset cleanup
to a before_shmem_exit() hook, ensuring that the fileset is dropped
while we can still report stats for underlying temporary files.
---
src/backend/storage/file/sharedfileset.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/src/backend/storage/file/sharedfileset.c b/src/backend/storage/file/sharedfileset.c
index ed37c940ad..0d9700bf56 100644
--- a/src/backend/storage/file/sharedfileset.c
+++ b/src/backend/storage/file/sharedfileset.c
@@ -36,7 +36,7 @@
static List *filesetlist = NIL;
static void SharedFileSetOnDetach(dsm_segment *segment, Datum datum);
-static void SharedFileSetDeleteOnProcExit(int status, Datum arg);
+static void SharedFileSetDeleteBeforeShmemExit(int status, Datum arg);
static void SharedFileSetPath(char *path, SharedFileSet *fileset, Oid tablespace);
static void SharedFilePath(char *path, SharedFileSet *fileset, const char *name);
static Oid ChooseTablespace(const SharedFileSet *fileset, const char *name);
@@ -112,7 +112,12 @@ SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
* fileset clean up.
*/
Assert(filesetlist == NIL);
- on_proc_exit(SharedFileSetDeleteOnProcExit, 0);
+
+ /*
+ * Register before-shmem-exit hook to ensure fileset is dropped
+ * while we can still report stats for underlying temporary files.
+ */
+ before_shmem_exit(SharedFileSetDeleteBeforeShmemExit, 0);
registered_cleanup = true;
}
@@ -259,12 +264,12 @@ SharedFileSetOnDetach(dsm_segment *segment, Datum datum)
}
/*
- * Callback function that will be invoked on the process exit. This will
+ * Callback function that will be invoked before shmem exit. This will
* process the list of all the registered sharedfilesets and delete the
* underlying files.
*/
static void
-SharedFileSetDeleteOnProcExit(int status, Datum arg)
+SharedFileSetDeleteBeforeShmemExit(int status, Datum arg)
{
/*
* Remove all the pending shared fileset entries. We don't use foreach()
--
2.24.3 (Apple Git-128)
v8-0003-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch (application/octet-stream)
From 16c8c672facfc89c4e0e80a6560eb7bc68bb3941 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v8 3/5] Add RESET command to ALTER SUBSCRIPTION command.
The ALTER SUBSCRIPTION ... RESET command resets subscription
parameters to their default values. The parameters that can be reset
are streaming, binary, and synchronous_commit.

The RESET command is required by a follow-up commit that introduces a
new parameter, skip_xid, which needs to be reset.
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 78 +++++++++++++++++-----
src/backend/parser/gram.y | 11 ++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 +++-
src/test/regress/sql/subscription.sql | 13 ++++
6 files changed, 109 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..8c3c28b7e7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,16 +193,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5157f44058..cc390ce95a 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -923,10 +938,43 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +1009,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1011,7 +1059,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1059,7 +1107,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..bcf85e8980 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,7 +9707,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e28248af32..504d65f7d6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,7 +3659,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3671,7 +3672,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 77b4437b69..b87f67fe55 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -284,11 +284,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index d42104c191..aa90560691 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -218,6 +218,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
v8-0001-Add-logical-changes-details-to-errcontext-of-appl.patch (application/octet-stream)
From 8744e51060d9c52ab1b6c7560cd8815a79176085 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v8 1/5] Add logical changes details to errcontext of apply
worker errors.
Previously, the error context was set only for data conversion
failures. This commit expands the error context to include the details
of the logical change being applied by the apply worker, newly showing
the command, transaction, and commit timestamp.

This additional information is used by a follow-up commit that makes
it possible to skip a particular transaction on the subscriber.
---
src/backend/replication/logical/proto.c | 51 +++++
src/backend/replication/logical/worker.c | 249 ++++++++++++++++-------
src/include/replication/logicalproto.h | 1 +
src/tools/pgindent/typedefs.list | 2 +-
4 files changed, 223 insertions(+), 80 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 52b65e9572..bb5016aa17 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1156,3 +1156,54 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Get string representing LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_PREPARE:
+ return "STREAM PREPARE";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ecaed157f2..a74493b610 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -203,12 +203,6 @@ typedef struct FlushPosition
static dlist_head lsn_mapping = DLIST_STATIC_INIT(lsn_mapping);
-typedef struct SlotErrCallbackArg
-{
- LogicalRepRelMapEntry *rel;
- int remote_attnum;
-} SlotErrCallbackArg;
-
typedef struct ApplyExecutionData
{
EState *estate; /* executor state, used to track resources */
@@ -221,6 +215,27 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+ LogicalRepRelMapEntry *rel;
+
+ /* Remote information */
+ int remote_attnum; /* -1 if invalid */
+ TransactionId remote_xid;
+ TimestampTz commit_ts;
+} ApplyErrCallbackArg;
+
+static ApplyErrCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .rel = NULL,
+ .remote_attnum = -1,
+ .remote_xid = InvalidTransactionId,
+ .commit_ts = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -335,6 +350,11 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for apply error callback */
+static void apply_error_callback(void *arg);
+static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts);
+static inline void reset_apply_error_context_info(void);
+
/*
* Should this worker apply changes for given relation.
*
@@ -580,26 +600,6 @@ slot_fill_defaults(LogicalRepRelMapEntry *rel, EState *estate,
ExecEvalExpr(defexprs[i], econtext, &slot->tts_isnull[defmap[i]]);
}
-/*
- * Error callback to give more context info about data conversion failures
- * while reading data from the remote server.
- */
-static void
-slot_store_error_callback(void *arg)
-{
- SlotErrCallbackArg *errarg = (SlotErrCallbackArg *) arg;
- LogicalRepRelMapEntry *rel;
-
- /* Nothing to do if remote attribute number is not set */
- if (errarg->remote_attnum < 0)
- return;
-
- rel = errarg->rel;
- errcontext("processing remote data for replication target relation \"%s.%s\" column \"%s\"",
- rel->remoterel.nspname, rel->remoterel.relname,
- rel->remoterel.attnames[errarg->remote_attnum]);
-}
-
/*
* Store tuple data into slot.
*
@@ -611,19 +611,9 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
{
int natts = slot->tts_tupleDescriptor->natts;
int i;
- SlotErrCallbackArg errarg;
- ErrorContextCallback errcallback;
ExecClearTuple(slot);
- /* Push callback + info on the error context stack */
- errarg.rel = rel;
- errarg.remote_attnum = -1;
- errcallback.callback = slot_store_error_callback;
- errcallback.arg = (void *) &errarg;
- errcallback.previous = error_context_stack;
- error_context_stack = &errcallback;
-
/* Call the "in" function for each non-dropped, non-null attribute */
Assert(natts == rel->attrmap->maplen);
for (i = 0; i < natts; i++)
@@ -637,7 +627,7 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
Assert(remoteattnum < tupleData->ncols);
- errarg.remote_attnum = remoteattnum;
+ apply_error_callback_arg.remote_attnum = remoteattnum;
if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
{
@@ -685,7 +675,7 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
slot->tts_isnull[i] = true;
}
- errarg.remote_attnum = -1;
+ apply_error_callback_arg.remote_attnum = -1;
}
else
{
@@ -699,9 +689,6 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
}
}
- /* Pop the error context stack */
- error_context_stack = errcallback.previous;
-
ExecStoreVirtualTuple(slot);
}
@@ -724,8 +711,6 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
{
int natts = slot->tts_tupleDescriptor->natts;
int i;
- SlotErrCallbackArg errarg;
- ErrorContextCallback errcallback;
/* We'll fill "slot" with a virtual tuple, so we must start with ... */
ExecClearTuple(slot);
@@ -739,14 +724,6 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
memcpy(slot->tts_values, srcslot->tts_values, natts * sizeof(Datum));
memcpy(slot->tts_isnull, srcslot->tts_isnull, natts * sizeof(bool));
- /* For error reporting, push callback + info on the error context stack */
- errarg.rel = rel;
- errarg.remote_attnum = -1;
- errcallback.callback = slot_store_error_callback;
- errcallback.arg = (void *) &errarg;
- errcallback.previous = error_context_stack;
- error_context_stack = &errcallback;
-
/* Call the "in" function for each replaced attribute */
Assert(natts == rel->attrmap->maplen);
for (i = 0; i < natts; i++)
@@ -763,7 +740,7 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
{
StringInfo colvalue = &tupleData->colvalues[remoteattnum];
- errarg.remote_attnum = remoteattnum;
+ apply_error_callback_arg.remote_attnum = remoteattnum;
if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
{
@@ -807,13 +784,10 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
slot->tts_isnull[i] = true;
}
- errarg.remote_attnum = -1;
+ apply_error_callback_arg.remote_attnum = -1;
}
}
- /* Pop the error context stack */
- error_context_stack = errcallback.previous;
-
/* And finally, declare that "slot" contains a valid virtual tuple */
ExecStoreVirtualTuple(slot);
}
@@ -827,6 +801,7 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.committime);
remote_final_lsn = begin_data.final_lsn;
@@ -860,6 +835,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -877,6 +853,7 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.prepare_time);
remote_final_lsn = begin_data.prepare_lsn;
@@ -962,6 +939,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -974,6 +952,7 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -1001,6 +980,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1013,6 +993,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1050,6 +1031,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1076,6 +1058,7 @@ apply_handle_stream_prepare(StringInfo s)
errmsg_internal("tablesync worker received a STREAM PREPARE message")));
logicalrep_read_stream_prepare(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.prepare_time);
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
@@ -1100,6 +1083,8 @@ apply_handle_stream_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1156,6 +1141,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ set_apply_error_context_xact(stream_xid, 0);
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1212,6 +1199,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1235,7 +1223,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ set_apply_error_context_xact(xid, 0);
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1260,6 +1251,8 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ set_apply_error_context_xact(subxid, 0);
+
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1284,6 +1277,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1315,6 +1309,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1459,6 +1455,7 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ set_apply_error_context_xact(xid, commit_data.committime);
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1473,6 +1470,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1592,6 +1591,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1615,6 +1617,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1713,6 +1718,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1766,6 +1774,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1869,6 +1880,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1894,6 +1908,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2328,44 +2345,62 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
+ bool set_callback = false;
+
+ /*
+ * Push apply error context callback if not yet. Other fields will be
+ * filled during applying the change. Since this function can be called
+ * recursively when applying spooled changes, we set the callback only
+ * once.
+ */
+ if (apply_error_callback_arg.command == 0)
+ {
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
+ set_callback = true;
+ }
+
+ apply_error_callback_arg.command = action;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2374,49 +2409,53 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_END:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_PREPARE:
apply_handle_stream_prepare(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Pop the error context stack */
+ if (set_callback)
+ error_context_stack = errcallback.previous;
}
/*
@@ -3649,3 +3688,55 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+/* Error callback to give more context info about the change being applied */
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+ ApplyErrCallbackArg *errarg = &apply_error_callback_arg;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("processing remote data during \"%s\""),
+ logicalrep_message_type(errarg->command));
+
+ if (errarg->rel)
+ appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname);
+
+ if (errarg->remote_attnum >= 0)
+ appendStringInfo(&buf, _(" column \"%s\""),
+ errarg->rel->remoterel.attnames[errarg->remote_attnum]);
+
+ if (TransactionIdIsNormal(errarg->remote_xid))
+ appendStringInfo(&buf, _(" in transaction id %u with commit timestamp %s"),
+ errarg->remote_xid,
+ errarg->commit_ts == 0
+ ? "(unset)"
+ : timestamptz_to_str(errarg->commit_ts));
+
+ errcontext("%s", buf.data);
+ pfree(buf.data);
+}
+
+/* Set transaction information of apply error callback */
+static inline void
+set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts)
+{
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.commit_ts = commit_ts;
+}
+
+/* Reset all information of apply error callback */
+static inline void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.rel = NULL;
+ apply_error_callback_arg.remote_attnum = -1;
+ set_apply_error_context_xact(InvalidTransactionId, 0);
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 2e29513151..af89f58fd3 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -246,5 +246,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 37cf4b2f76..2dea7b1ac7 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -113,6 +113,7 @@ Append
AppendPath
AppendRelInfo
AppendState
+ApplyErrCallbackArg
ApplyExecutionData
ApplySubXactData
Archive
@@ -2423,7 +2424,6 @@ SlabBlock
SlabChunk
SlabContext
SlabSlot
-SlotErrCallbackArg
SlotNumber
SlruCtl
SlruCtlData
--
2.24.3 (Apple Git-128)
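As background for the error-callback plumbing in the patch above: PostgreSQL chains ErrorContextCallback structs into a stack (error_context_stack) that the error reporter walks to build CONTEXT lines, and apply_dispatch() pushes and pops one frame around each message. The shape of that idiom can be sketched in a self-contained way (Python here purely for illustration; the class and function names below are made up for this sketch, not PostgreSQL APIs):

```python
# Generic illustration of the push/pop error-context idiom: callbacks are
# pushed before risky work and popped afterwards; an error reporter walks
# the stack, newest frame first, to build CONTEXT lines.

class ErrorContextStack:
    def __init__(self):
        self._stack = []

    def push(self, callback):
        self._stack.append(callback)

    def pop(self):
        self._stack.pop()

    def context_lines(self):
        # Newest frame first, like PostgreSQL's CONTEXT output
        return [cb() for cb in reversed(self._stack)]


def apply_error_context(command, rel, xid):
    # Mirrors the kind of message apply_error_callback() builds
    return (f'processing remote data during "{command}" '
            f'for relation "{rel}" in transaction {xid}')


stack = ErrorContextStack()
stack.push(lambda: apply_error_context("INSERT", "public.test", 716))
lines = stack.context_lines()
stack.pop()  # popping restores the previous stack state

print(lines[0])
```

The key property, as in the patch, is that popping restores whatever stack was in place before the push, so the idiom nests safely even when apply_dispatch() is called recursively for spooled changes.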
Attachment: v8-0004-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch (application/octet-stream)
From dfad4cb08b39f6034ea0dc1472fc5d959f0c2670 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v8 4/5] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates a constraint, logical replication stops
until the problem is resolved. This commit introduces another way to
skip the transaction in question.
The user can specify the XID with ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), which updates the pg_subscription.subskipxid field and tells the
apply worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of the
subscription in the pg_stat_subscription_errors system view so that the
user is not confused. This is done by sending a message for clearing
the subscription error to the stats collector.
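For illustration, the intended workflow on the subscriber might look like the following sketch (the subscription name and XID here are hypothetical):

```sql
-- Identify the failing transaction from the error statistics view
-- introduced earlier in this patch series.
SELECT subname, xid, last_failure_message FROM pg_stat_subscription_errors;

-- Tell the apply worker to skip that transaction entirely.
ALTER SUBSCRIPTION test_sub SET (skip_xid = 716);

-- Once the transaction has been skipped, the worker clears the field:
SELECT subname, subskipxid FROM pg_subscription;
```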
---
doc/src/sgml/logical-replication.sgml | 49 ++++-
doc/src/sgml/ref/alter_subscription.sgml | 32 ++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 45 +++-
src/backend/postmaster/pgstat.c | 44 +++-
src/backend/replication/logical/worker.c | 201 ++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/include/pgstat.h | 8 +-
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 +
src/test/subscription/t/024_skip_xact.pl | 244 +++++++++++++++++++++
12 files changed, 637 insertions(+), 25 deletions(-)
create mode 100644 src/test/subscription/t/024_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..1e3c8c40f5 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in the <structname>pg_stat_subscription_errors</structname>
+ view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: during apply of "INSERT" for relation "public.test" in transaction with xid 716 commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> on the subscription
+ with <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Either way, these should be used as a last resort. They skip the whole
+ transaction, including changes that may not violate any constraint, and can
+ easily make the subscriber inconsistent if the user specifies the wrong
+ transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 8c3c28b7e7..cfb318e08c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -205,8 +205,36 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are: <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ <literal>streaming</literal>, and the following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ will stop until the problem is resolved. The resolution can be done
+ either by changing data on the subscriber so that it does not
+ conflict with the incoming change or by skipping the whole
+ transaction. This option specifies the ID of the transaction whose
+ changes the logical replication worker will skip applying. The
+ worker skips all data modification changes within the specified
+ transaction. Since this skips the whole transaction, including
+ changes that may not violate any constraint, it should only be used
+ as a last resort. This option has no effect on a transaction that
+ is already prepared with <literal>two_phase</literal> enabled on
+ the subscriber. After the logical replication worker successfully
+ skips the transaction, the transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index cc390ce95a..188f3e42fd 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -885,7 +913,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, false);
@@ -934,6 +962,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
@@ -941,7 +976,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, true);
@@ -967,6 +1002,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 498ea63863..90172fc99d 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1745,11 +1745,32 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
msg.m_subid = subid;
msg.m_subrelid = subrelid;
+ msg.m_clear = false;
msg.m_reset = true;
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
}
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error of the subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_clear = true;
+ msg.m_reset = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -2092,6 +2113,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_subrelid = subrelid;
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
+ msg.m_clear = false;
msg.m_reset = false;
msg.m_command = command;
msg.m_xid = xid;
@@ -6224,27 +6246,37 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
+
PgStat_StatSubErrEntry *errent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
errent = pgstat_get_subscription_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
+ Assert(!(msg->m_reset && msg->m_clear));
+
if (errent == NULL)
return;
- /* reset fields and set reset timestamp */
errent->relid = InvalidOid;
errent->command = 0;
errent->xid = InvalidTransactionId;
errent->failure_count = 0;
- errent->last_failure = 0;
- errent->last_errmsg[0] = '\0';
- errent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 7ecabaacb8..db31d29e44 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -269,6 +270,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID if we're skipping all data modification changes
+ * (INSERT/UPDATE/DELETE/TRUNCATE) of the transaction specified in
+ * MySubscription->skipxid. Note that we don't skip receiving the changes, in
+ * particular in streaming cases, since we decide whether or not to skip
+ * applying the changes only when starting to apply them. Once we start
+ * skipping changes, we copy the XID to skipping_xid and don't stop skipping
+ * until we have skipped the whole transaction, even if the subscription is
+ * invalidated and MySubscription->skipxid gets changed or reset. When we stop
+ * skipping, we reset the skip XID (subskipxid) in the pg_subscription catalog
+ * and associate the origin status with the transaction that resets the skip
+ * XID, so that we can start streaming from the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -355,6 +371,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -805,6 +824,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -829,7 +853,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop skipping the transaction if enabled. Otherwise, commit the
+ * changes that have just been applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -857,6 +892,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -915,9 +953,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -931,6 +970,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1062,6 +1105,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1072,6 +1118,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1097,9 +1147,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1121,6 +1172,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1133,9 +1187,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1217,6 +1268,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1310,6 +1362,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping the transaction if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1459,9 +1515,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2348,6 +2418,17 @@ apply_dispatch(StringInfo s)
ErrorContextCallback errcallback;
bool set_callback = false;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Push apply error context callback if not yet. Other fields will be
* filled during applying the change. Since this function can be called
@@ -3774,3 +3855,103 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes, clearing the skip XID
+ * (pg_subscription.subskipxid). If origin_lsn and origin_committs are valid,
+ * we set the origin state to the transaction commit that resets the skip XID
+ * so that we can start streaming from the transaction next to the one we just
+ * skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know that
+ * the subscription is no longer blocked by the conflict.
+ *
+ * The message for clearing the error statistics can be lost, but that's
+ * okay. The user can confirm that logical replication is working fine in
+ * other ways, for example, by checking the pg_stat_subscription view, and
+ * can reset the error statistics of a single subscription with the
+ * pg_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 504d65f7d6..aec06b0d23 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3677,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index f66eb290bb..a611ca7cce 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -556,7 +556,7 @@ typedef struct PgStat_MsgReplSlot
/* ----------
* PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
- * update/reset the error happening during logical
+ * update/reset/clear the error happening during logical
* replication.
* ----------
*/
@@ -574,7 +574,10 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The clear message uses below field */
+
+ /* The clear and reset messages use the fields below */
+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */
bool m_reset; /* Reset all fields and set reset_stats
* timestamp */
@@ -1133,6 +1136,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index b87f67fe55..217b5fabd1 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -296,6 +296,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index aa90560691..4c9d25f0a4 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -231,6 +231,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/024_skip_xact.pl b/src/test/subscription/t/024_skip_xact.pl
new file mode 100644
index 0000000000..affb663803
--- /dev/null
+++ b/src/test/subscription/t/024_skip_xact.pl
@@ -0,0 +1,244 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# Test that the error reported in the pg_stat_subscription_errors view is as expected.
+sub test_subscription_error
+{
+ my ($node, $source, $relname, $expected_error, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Check the error reported in the pg_stat_subscription_errors view and skip the
+# failed transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $source, $subname, $relname, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $source, $relname, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber so logical replication restarts immediately,
+ # without waiting for wal_retrieve_retry_interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+ # Also wait for the error details to be cleared.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT command IS NULL FROM pg_stat_subscription_errors
+WHERE subname = '$subname' AND failure_source = '$source';
+]) or die "Timed out while waiting for the error details to be cleared";
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# Don't overflow the server log with repeated error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violating
+# the unique constraint on test_tab1. Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber,
+ 'tablesync', 'test_tab2',
+ qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'check the error reported by the table sync worker');
+
+# Insert enough rows into test_tab_streaming to exceed the 64kB limit, also raising
+# an error on the subscriber while applying the spooled changes, for the same reason.
+# Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error reported by the apply worker while applying streaming changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping transactions that are prepared, both non-streamed and streamed.
+# We insert the same data as in the previous tests but prepare the transactions.
+# Those insertions raise errors on the subscriber. Then we skip the transactions in
+# question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error on changes of the prepared transaction');
+
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
+# Check that the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
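For readers following along, the end-to-end workflow that the TAP test above exercises can be sketched as follows, using the syntax proposed in this patch set; the subscription name and XID here are hypothetical example values, not part of the patch:

```sql
-- On the subscriber: find the XID of the failing remote transaction.
SELECT subname, xid, last_failure_message
FROM pg_stat_subscription_errors;

-- Tell the apply worker to skip all changes of that transaction
-- (xid 590 is a made-up example value).
ALTER SUBSCRIPTION tap_sub SET (skip_xid = '590');

-- Once the transaction has been skipped, pg_subscription.subskipxid
-- is reset to NULL; the setting can also be cleared manually:
ALTER SUBSCRIPTION tap_sub RESET (skip_xid);
```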
Attachment: v8-0002-Add-pg_stat_subscription_errors-statistics-view.patch (application/octet-stream)
From 6f2d77c72a4e40e52ff83c54ec0cae8e29a962c4 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v8 2/5] Add pg_stat_subscription_errors statistics view.
This commit adds a new system view, pg_stat_subscription_errors, showing
errors happening while applying logical replication changes as well as
during initial table synchronization.
The subscription error entries are removed by autovacuum workers: in
table sync worker cases, when the table synchronization has completed,
and in apply worker cases, when the subscription is dropped.
It also adds the SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
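As an illustrative sketch (not part of the patch itself), the view and the reset function added here might be used like this; the subscription name is hypothetical:

```sql
-- Inspect errors reported by apply and tablesync workers.
SELECT subname, relid::regclass AS rel, command, xid,
       failure_source, failure_count, last_failure_message
FROM pg_stat_subscription_errors;

-- Reset the apply worker's error statistics for one subscription;
-- pass the relation's OID as the second argument instead to reset
-- a tablesync worker's error.
SELECT pg_stat_reset_subscription_error(
         (SELECT oid FROM pg_subscription WHERE subname = 'tap_sub'),
         NULL);
```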
---
doc/src/sgml/monitoring.sgml | 160 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/postmaster/pgstat.c | 651 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 40 +-
src/backend/utils/adt/pgstatfuncs.c | 119 +++++
src/backend/utils/error/elog.c | 16 +
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 106 ++++
src/include/utils/elog.h | 1 +
src/test/regress/expected/rules.out | 22 +
src/tools/pgindent/typedefs.list | 5 +
12 files changed, 1159 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..0fd0bbfa1f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that happened on a subscription, showing information
+ about subscription errors.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error happened. This
+ field is always NULL if the error was reported by a
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher's transaction that was being
+ applied when the error happened. This field is always NULL if
+ the error was reported by a <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times the error happened on the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the last failure time.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..449692afa9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 1b54ef74eb..498ea63863 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/partition.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_HASH_SIZE 32
/* ----------
@@ -280,6 +283,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -330,6 +334,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubEntry *pgstat_get_subscription_entry(Oid subid,
+ bool create);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -370,6 +380,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1148,6 +1162,166 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubEntry *subent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS hstat_rel;
+ List *not_ready_rels_list;
+ HTAB *not_ready_rels_htab;
+ ListCell *lc;
+ HASHCTL hash_ctl;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = subent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (subent->suberrors == NULL)
+ continue;
+
+ /*
+ * The subscription has error entries. We search for errors of
+ * table sync workers whose relations are already in sync state.
+ * Those errors should be removed.
+ *
+ * Note that the lifetimes of error entries of the apply worker
+ * and the table sync worker are different. The former lives
+ * until the subscription is dropped whereas the latter lives
+ * until the table synchronization is completed.
+ */
+ not_ready_rels_list = GetSubscriptionNotReadyRelations(subent->subid);
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ /*
+ * The number of not-ready relations can be high for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for a faster lookup.
+ */
+ foreach(lc, not_ready_rels_list)
+ {
+ SubscriptionRelState *r_elem = (SubscriptionRelState *) lfirst(lc);
+ SubscriptionRelState *r_entry;
+
+ CHECK_FOR_INTERRUPTS();
+ r_entry = hash_search(not_ready_rels_htab, (void *) &(r_elem->relid),
+ HASH_ENTER, NULL);
+ memcpy(r_entry, r_elem, sizeof(SubscriptionRelState));
+ }
+
+ list_free(not_ready_rels_list);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+
+ /*
+ * Search for all error entries whose relation is already in
+ * ready state.
+ */
+ hash_seq_init(&hstat_rel, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(errent->subrelid))
+ continue;
+
+ /*
+ * Add the relid to the message if the table synchronization
+ * for this relation is already complete or the table is no
+ * longer subscribed.
+ */
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->subrelid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(not_ready_rels_htab);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1557,6 +1731,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the subscription error.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1877,6 +2070,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about the subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_reset = false;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_failure_time = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2944,6 +3168,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3525,6 +3765,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3826,6 +4079,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionHash)
+ {
+ PgStat_StatSubEntry *subent;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS relhstat;
+ int32 nerrors;
+
+ /* Skip this subscription if it doesn't have any errors */
+ if (subent->suberrors == NULL)
+ continue;
+
+ nerrors = hash_get_num_entries(subent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(subent, sizeof(PgStat_StatSubEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubErrEntry entry, which
+ * contains the fixed-length error message string of
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes, bloating the stats
+ * file. That's okay since we assume that the number of error
+ * entries is not high. But if that expectation turns out to be
+ * false, we should write the string and its length instead.
+ */
+ rc = fwrite(errent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4287,6 +4584,100 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubEntry *subent;
+ int32 nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin) !=
+ sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription error entry */
+ subent =
+ (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &(subbuf.subid),
+ HASH_ENTER, NULL);
+ subent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Read error entries */
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubErrEntry errbuf;
+ PgStat_StatSubErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ errent =
+ (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &(errbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(errent, &errbuf, sizeof(PgStat_StatSubErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4629,6 +5020,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4834,6 +5269,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5779,6 +6215,116 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ if (errent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ Assert(errent);
+
+ /* update the error entry */
+ errent->databaseid = msg->m_databaseid;
+ errent->relid = msg->m_relid;
+ errent->command = msg->m_command;
+ errent->xid = msg->m_xid;
+ errent->failure_count++;
+ errent->last_failure = msg->m_failure_time;
+ strlcpy(errent->last_errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (subent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (subent->suberrors != NULL)
+ hash_destroy(subent->suberrors);
+
+ /* Remove the subscription entry */
+ (void) hash_search(subscriptionHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subid, false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription with msg->m_subid is removed and the
+ * corresponding entry is also removed before receiving the error purge
+ * message.
+ */
+ if (subent == NULL || subent->suberrors == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ (void) hash_search(subent->suberrors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5876,6 +6422,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_entry
+ *
+ * Return the subscription statistics entry for the given subscription OID.
+ * Return NULL if it is not found and the caller didn't request to create it.
+ *
+ * 'create' tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubEntry *
+pgstat_get_subscription_entry(Oid subid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ subent = (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize field */
+ if (create && !found)
+ subent->suberrors = NULL;
+
+ return subent;
+}
+
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription OID and
+ * relation OID. Return NULL if it is not found and the caller didn't
+ * request to create it.
+ *
+ * 'create' tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ PgStat_StatSubErrEntry *errent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ subent = pgstat_get_subscription_entry(subid, create);
+
+ if (subent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ errent = (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ errent->databaseid = InvalidOid;
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = 0;
+ }
+
+ return errent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index a74493b610..7ecabaacb8 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3556,8 +3556,23 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ geterrmessage());
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3675,7 +3690,26 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ geterrmessage());
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..4048c99a9e 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset subscription error stats */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,104 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubErrEntry *errent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ errent = pgstat_fetch_subscription_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (errent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(errent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(errent->relid))
+ values[2] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[2] = true;
+
+ if (errent->command == 0)
+ nulls[3] = true;
+ else
+ {
+ /* should be apply worker */
+ Assert(!OidIsValid(errent->subrelid));
+
+ values[3] = CStringGetTextDatum(logicalrep_message_type(errent->command));
+ }
+
+ if (TransactionIdIsValid(errent->xid))
+ values[4] = TransactionIdGetDatum(errent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(errent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ values[6] = Int64GetDatum(errent->failure_count);
+
+ if (errent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(errent->last_failure);
+
+ values[8] = CStringGetTextDatum(errent->last_errmsg);
+
+ if (errent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(errent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..dd36850016 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,6 +1441,22 @@ getinternalerrposition(void)
return edata->internalpos;
}
+/*
+ * geterrmessage --- return the currently set error message
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+const char *
+geterrmessage(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return (const char *) edata->message;
+}
/*
* Functions to allow construction of error message strings separately from
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b603700ed9..7f9c27bdda 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 2068a68a5f..f66eb290bb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -550,6 +554,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync
+ * worker to report or reset an error that happened during
+ * logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* The reset message uses the field below */
+ bool m_reset; /* Reset all fields and set the stats reset
+ * timestamp */
+
+ /* The error report message uses the fields below */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_failure_time;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by autovacuum to purge the statistics
+ * of dropped subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by autovacuum to purge the error
+ * entries of a subscription.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -722,6 +788,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -938,6 +1007,38 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription statistics kept in the stats collector.
+ */
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubEntry;
+
+/*
+ * Subscription error statistics kept in the stats collector. One entry
+ * represents an error that happened during logical replication, reported
+ * either by the apply worker (subrelid is InvalidOid) or by the table sync
+ * worker (subrelid is a valid OID). The table sync worker's error entry is
+ * also removed when the table synchronization process completes.
+ */
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char last_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1031,6 +1132,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1047,6 +1149,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1145,6 +1250,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid, Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f53607e12e..155145a77d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -222,6 +222,7 @@ extern int err_generic_string(int field, const char *str);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
+extern const char *geterrmessage(void);
/*----------
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..ffad9790ae 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 2dea7b1ac7..05f5ab7a0b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1938,6 +1938,9 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionErr
+PgStat_MsgSubscriptionErrPurge
+PgStat_MsgSubscriptionPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1949,6 +1952,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubEntry
+PgStat_StatSubErrEntry
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Thu, Aug 12, 2021 at 3:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset
cleanup to make cfbot tests happy.
A minor comment on the 0001 patch: In the message I think that using
"ID" would look better than lowercase "id" and AFAICS it's more
consistent with existing messages.
+ appendStringInfo(&buf, _(" in transaction id %u with commit timestamp %s"),
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Aug 12, 2021 at 1:21 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Aug 12, 2021 at 3:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset
cleanup to make cfbot tests happy.

A minor comment on the 0001 patch: In the message I think that using
"ID" would look better than lowercase "id" and AFAICS it's more
consistent with existing messages.+ appendStringInfo(&buf, _(" in transaction id %u with commit timestamp %s"),
You have a point but I think in this case it might look a bit odd as
we have another field 'commit timestamp' after that which is
lowercase.
--
With Regards,
Amit Kapila.
On Thu, Aug 12, 2021 at 9:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
A minor comment on the 0001 patch: In the message I think that using
"ID" would look better than lowercase "id" and AFAICS it's more
consistent with existing messages.

+ appendStringInfo(&buf, _(" in transaction id %u with commit timestamp %s"),
You have a point but I think in this case it might look a bit odd as
we have another field 'commit timestamp' after that which is
lowercase.
I did a quick search and I couldn't find any other messages in the
Postgres code that use "transaction id", but I could find some that
use "transaction ID" and "transaction identifier".
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Aug 12, 2021 at 5:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Aug 12, 2021 at 9:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
A minor comment on the 0001 patch: In the message I think that using
"ID" would look better than lowercase "id" and AFAICS it's more
consistent with existing messages.

+ appendStringInfo(&buf, _(" in transaction id %u with commit timestamp %s"),

You have a point but I think in this case it might look a bit odd as
we have another field 'commit timestamp' after that which is
lowercase.

I did a quick search and I couldn't find any other messages in the
Postgres code that use "transaction id", but I could find some that
use "transaction ID" and "transaction identifier".
Okay, but that doesn't mean using it here is bad. I am personally fine
with a message containing something like "... in transaction
id 740 with commit timestamp 2021-08-10 14:44:38.058174+05:30" but I
won't mind if you and or others find some other way convenient. Any
opinion from others?
--
With Regards,
Amit Kapila.
On Fri, Aug 13, 2021 at 2:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 12, 2021 at 5:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Aug 12, 2021 at 9:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
A minor comment on the 0001 patch: In the message I think that using
"ID" would look better than lowercase "id" and AFAICS it's more
consistent with existing messages.

+ appendStringInfo(&buf, _(" in transaction id %u with commit timestamp %s"),

You have a point but I think in this case it might look a bit odd as
we have another field 'commit timestamp' after that which is
lowercase.

I did a quick search and I couldn't find any other messages in the
Postgres code that use "transaction id", but I could find some that
use "transaction ID" and "transaction identifier".

Okay, but that doesn't mean using it here is bad. I am personally fine
with a message containing something like "... in transaction
id 740 with commit timestamp 2021-08-10 14:44:38.058174+05:30" but I
won't mind if you and or others find some other way convenient. Any
opinion from others?
Just to be clear, all I was saying is that I thought using uppercase
"ID" looked better in the message, and was more consistent with
existing logged messages, than using lowercase "id".
i.e. my suggestion was a trivial change:
BEFORE:
+ appendStringInfo(&buf, _(" in transaction id %u with commit timestamp %s"),
AFTER:
+ appendStringInfo(&buf, _(" in transaction ID %u with commit timestamp %s"),
But it was just a suggestion. Maybe others feel differently.
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Aug 13, 2021 at 1:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 12, 2021 at 5:41 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Aug 12, 2021 at 9:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
A minor comment on the 0001 patch: In the message I think that using
"ID" would look better than lowercase "id" and AFAICS it's more
consistent with existing messages.

+ appendStringInfo(&buf, _(" in transaction id %u with commit timestamp %s"),

You have a point but I think in this case it might look a bit odd as
we have another field 'commit timestamp' after that which is
lowercase.

I did a quick search and I couldn't find any other messages in the
Postgres code that use "transaction id", but I could find some that
use "transaction ID" and "transaction identifier".

Okay, but that doesn't mean using it here is bad. I am personally fine
with a message containing something like "... in transaction
id 740 with commit timestamp 2021-08-10 14:44:38.058174+05:30" but I
won't mind if you and or others find some other way convenient. Any
opinion from others?
I don't have a strong opinion on this but in terms of consistency we
often use like "transaction %u" in messages when showing XID value,
rather than "transaction [id|ID|identifier]":
$ git grep -i "errmsg.*transaction %u" src/backend/
src/backend/access/transam/commit_ts.c: errmsg("cannot retrieve commit timestamp for transaction %u", xid)));
src/backend/access/transam/slru.c: errmsg("could not access status of transaction %u", xid),
src/backend/access/transam/slru.c: errmsg("could not access status of transaction %u", xid),
src/backend/access/transam/slru.c: errmsg("could not access status of transaction %u", xid),
src/backend/access/transam/slru.c: (errmsg("could not access status of transaction %u", xid),
src/backend/access/transam/slru.c: errmsg("could not access status of transaction %u", xid),
src/backend/access/transam/slru.c: (errmsg("could not access status of transaction %u", xid),
src/backend/access/transam/slru.c: errmsg("could not access status of transaction %u", xid),
src/backend/access/transam/slru.c: errmsg("could not access status of transaction %u", xid),
src/backend/access/transam/twophase.c: (errmsg("recovering prepared transaction %u from shared memory", xid)));
src/backend/access/transam/twophase.c: (errmsg("removing stale two-phase state file for transaction %u",
src/backend/access/transam/twophase.c: (errmsg("removing stale two-phase state from memory for transaction %u",
src/backend/access/transam/twophase.c: (errmsg("removing future two-phase state file for transaction %u",
src/backend/access/transam/twophase.c: (errmsg("removing future two-phase state from memory for transaction %u",
src/backend/access/transam/twophase.c: errmsg("corrupted two-phase state file for transaction %u",
src/backend/access/transam/twophase.c: errmsg("corrupted two-phase state in memory for transaction %u",
src/backend/access/transam/xlog.c: (errmsg("recovery stopping before commit of transaction %u, time %s",
src/backend/access/transam/xlog.c: (errmsg("recovery stopping before abort of transaction %u, time %s",
src/backend/access/transam/xlog.c: (errmsg("recovery stopping after commit of transaction %u, time %s",
src/backend/access/transam/xlog.c: (errmsg("recovery stopping after abort of transaction %u, time %s",
src/backend/replication/logical/worker.c: errmsg_internal("transaction %u not found in stream XID hash table",
src/backend/replication/logical/worker.c: errmsg_internal("transaction %u not found in stream XID hash table",
src/backend/replication/logical/worker.c: errmsg_internal("transaction %u not found in stream XID hash table",
src/backend/replication/logical/worker.c: errmsg_internal("transaction %u not found in stream XID hash table",
$ git grep -i "errmsg.*transaction identifier" src/backend/
src/backend/access/transam/twophase.c:
errmsg("transaction identifier \"%s\" is too long",
src/backend/access/transam/twophase.c:
errmsg("transaction identifier \"%s\" is already in use",
$ git grep -i "errmsg.*transaction id" src/backend/
src/backend/access/transam/twophase.c:
errmsg("transaction identifier \"%s\" is too long",
src/backend/access/transam/twophase.c:
errmsg("transaction identifier \"%s\" is already in use",
src/backend/access/transam/varsup.c:
(errmsg_internal("transaction ID wrap limit is %u, limited by database
with OID %u",
src/backend/access/transam/xlog.c: (errmsg_internal("next
transaction ID: " UINT64_FORMAT "; next OID: %u",
src/backend/access/transam/xlog.c: (errmsg_internal("oldest
unfrozen transaction ID: %u, in database %u",
src/backend/access/transam/xlog.c: (errmsg("invalid next
transaction ID")));
src/backend/replication/logical/snapbuild.c:
(errmsg_plural("exported logical decoding snapshot: \"%s\" with %u
transaction ID",
src/backend/replication/logical/worker.c:
errmsg_internal("invalid transaction ID in streamed replication
transaction")));
src/backend/replication/logical/worker.c:
errmsg_internal("invalid transaction ID in streamed replication
transaction")));
src/backend/replication/logical/worker.c:
errmsg_internal("invalid two-phase transaction ID")));
src/backend/utils/adt/xid8funcs.c: errmsg("transaction
ID %s is in the future",
Therefore, perhaps a message like "... in transaction 740 with commit
timestamp 2021-08-10 14:44:38.058174+05:30" is better in terms of
consistency with other messages?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Aug 16, 2021 at 6:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Therefore, perhaps a message like "... in transaction 740 with commit
timestamp 2021-08-10 14:44:38.058174+05:30" is better in terms of
consistency with other messages?
Yes, I think that would be more consistent.
On another note, for the 0001 patch, the elog ERROR at the bottom of
the logicalrep_message_type() function seems to assume that the
unrecognized "action" is a printable character (with its use of %c)
and also that the character is meaningful to the user in some way.
But given that the compiler normally warns of an unhandled enum value
when switching on an enum, such an error would most likely be when
action is some int value that wouldn't be meaningful to the user (as
it wouldn't be one of the LogicalRepMsgType enum values).
I therefore think it would be better to use %d in that ERROR:
i.e.
+ elog(ERROR, "invalid logical replication message type %d", action);
Similar comments apply to the apply_dispatch() function (and I realise
it used %c before your patch).
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Aug 16, 2021 at 1:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Aug 13, 2021 at 1:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Okay, but that doesn't mean using it here is bad. I am personally fine
with a message containing something like "... in transaction
id 740 with commit timestamp 2021-08-10 14:44:38.058174+05:30" but I
won't mind if you and/or others find some other way convenient. Any
opinion from others?

I don't have a strong opinion on this, but in terms of consistency we
often use "transaction %u" in messages when showing an XID value,
rather than "transaction [id|ID|identifier]":
..
Therefore, perhaps a message like "... in transaction 740 with commit
timestamp 2021-08-10 14:44:38.058174+05:30" is better in terms of
consistency with other messages?
+1.
--
With Regards,
Amit Kapila.
On Thu, Aug 12, 2021 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset cleanup to make
cfbot tests happy.
Hi,
Thanks for the new patches.
I have a few comments on the v8-0001 patch.
1)
+
+ if (TransactionIdIsNormal(errarg->remote_xid))
+ appendStringInfo(&buf, _(" in transaction id %u with commit timestamp %s"),
+ errarg->remote_xid,
+ errarg->commit_ts == 0
+ ? "(unset)"
+ : timestamptz_to_str(errarg->commit_ts));
+
+ errcontext("%s", buf.data);
I think we can output the timestamp in a separate check, which would be more
consistent with the code style elsewhere in apply_error_callback(),
i.e.:
+ if (errarg->commit_ts != 0)
+ appendStringInfo(&buf, _(" with commit timestamp %s"),
+ timestamptz_to_str(errarg->commit_ts));
2)
+/*
+ * Get string representing LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
...
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+}
Some old compilers might complain that the function doesn't have a return value
at the end of the function; maybe we can code it like the following:
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
...
+ default:
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+ }
+ return NULL; /* keep compiler quiet */
+}
3)
Do we need to invoke set_apply_error_context_xact() in the function
apply_handle_stream_prepare() to save the xid and timestamp ?
Best regards,
Hou zj
Monday, August 16, 2021 3:00 PM Hou, Zhijie wrote:
On Thu, Aug 12, 2021 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset
cleanup to make cfbot tests happy.

Hi,
Thanks for the new patches.
I have a few comments on the v8-0001 patch.
3)
Do we need to invoke set_apply_error_context_xact() in the function
apply_handle_stream_prepare() to save the xid and timestamp ?
Sorry, this comment wasn't correct, please ignore it.
Here is another comment:
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
...
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
...
I think most of the existing code uses "STREAM STOP" to describe the
LOGICAL_REP_MSG_STREAM_END message, is it better to return "STREAM STOP" in
function logicalrep_message_type() too ?
Best regards,
Hou zj
On Mon, Aug 16, 2021 at 5:54 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
Here is another comment:
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
...
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
...

I think most of the existing code uses "STREAM STOP" to describe the
LOGICAL_REP_MSG_STREAM_END message, is it better to return "STREAM STOP" in
function logicalrep_message_type() too ?
+1
I think you're right, it should be "STREAM STOP" in that case.
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Aug 12, 2021 at 3:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset
cleanup to make cfbot tests happy.
Another comment on the 0001 patch: as there is now a mix of setting
"apply_error_callback_arg" members directly and also through inline
functions, it might look better to have it done consistently with
functions having prototypes something like the following:
static inline void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static inline void reset_apply_error_context_rel(void);
static inline void set_apply_error_context_attnum(int remote_attnum);
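As a self-contained illustration of that suggestion (the struct and its fields here are simplified stand-ins, not the real ApplyErrorCallbackArg definition in worker.c), the setters might look something like:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-in for the real ApplyErrorCallbackArg; the field names
 * are illustrative only.
 */
typedef struct ApplyErrorCallbackArg
{
	const char *rel_name;		/* target relation name, NULL if unset */
	int			remote_attnum;	/* -1 if not applying a column */
} ApplyErrorCallbackArg;

static ApplyErrorCallbackArg apply_error_callback_arg = {NULL, -1};

/* Setters mirroring the prototypes proposed above */
static inline void
set_apply_error_context_rel(const char *rel_name)
{
	apply_error_callback_arg.rel_name = rel_name;
}

static inline void
reset_apply_error_context_rel(void)
{
	apply_error_callback_arg.rel_name = NULL;
	apply_error_callback_arg.remote_attnum = -1;
}

static inline void
set_apply_error_context_attnum(int remote_attnum)
{
	apply_error_callback_arg.remote_attnum = remote_attnum;
}
```

Routing all updates through such inline functions keeps each assignment site consistent and makes it harder to forget a reset.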
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Aug 16, 2021 at 3:59 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Thu, Aug 12, 2021 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset cleanup to make
cfbot tests happy.

Hi,
Thanks for the new patches.
I have a few comments on the v8-0001 patch.
Thank you for the comments!
2)
+/*
+ * Get string representing LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
...
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+}

Some old compilers might complain that the function doesn't have a return value
at the end of the function, maybe we can code like the following:

+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
...
+ default:
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+ }
+ return NULL; /* keep compiler quiet */
+}
Fixed.
3)
Do we need to invoke set_apply_error_context_xact() in the function
apply_handle_stream_prepare() to save the xid and timestamp ?
Yes. I think that the v8-0001 patch already sets the xid and timestamp just
after parsing the stream_prepare message. Did you mean it's not necessary?
I'll submit the updated patches soon.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Aug 16, 2021 at 5:30 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Aug 16, 2021 at 5:54 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

Here is another comment:

+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
...
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
...

I think most of the existing code uses "STREAM STOP" to describe the
LOGICAL_REP_MSG_STREAM_END message, is it better to return "STREAM STOP" in
function logicalrep_message_type() too ?

+1
I think you're right, it should be "STREAM STOP" in that case.
It's right that we use "STREAM STOP" rather than "STREAM END" in many
places such as elog messages, a callback name, and source code
comments. As far as I can find, there are only two places where we use
"STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
doc/src/sgml/protocol.sgml. Isn't it better to fix these
inconsistencies in the first place? I think "STREAM STOP" would be
more appropriate.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Aug 17, 2021 at 10:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Aug 16, 2021 at 5:30 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Aug 16, 2021 at 5:54 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

Here is another comment:

+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
...
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
...

I think most of the existing code uses "STREAM STOP" to describe the
LOGICAL_REP_MSG_STREAM_END message, is it better to return "STREAM STOP" in
function logicalrep_message_type() too ?

+1
I think you're right, it should be "STREAM STOP" in that case.

It's right that we use "STREAM STOP" rather than "STREAM END" in many
places such as elog messages, a callback name, and source code
comments. As far as I can find, there are only two places where we use
"STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
doc/src/sgml/protocol.sgml. Isn't it better to fix these
inconsistencies in the first place? I think “STREAM STOP” would be
more appropriate.
I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
seems to be a bit better because of the value 'E' we use for it.
--
With Regards,
Amit Kapila.
On Thursday, August 12, 2021 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset
cleanup to make cfbot tests happy.
Hi
Thanks for your patch. I met a problem when using it. The log output is not what I expected in the non-streaming case, but in streaming mode it works as expected.
For example:
------publisher------
create table test (a int primary key, b varchar);
create publication pub for table test;
------subscriber------
create table test (a int primary key, b varchar);
insert into test values (10000);
create subscription sub connection 'dbname=postgres port=5432' publication pub with(streaming=on);
------publisher------
insert into test values (10000);
Subscriber log:
2021-08-17 14:24:43.415 CST [3630341] ERROR: duplicate key value violates unique constraint "test_pkey"
2021-08-17 14:24:43.415 CST [3630341] DETAIL: Key (a)=(10000) already exists.
It didn't give the additional context info generated by the apply_error_callback function.
In streaming mode (which worked as expected):
------publisher------
INSERT INTO test SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
Subscriber log:
2021-08-17 14:26:26.521 CST [3630510] ERROR: duplicate key value violates unique constraint "test_pkey"
2021-08-17 14:26:26.521 CST [3630510] DETAIL: Key (a)=(10000) already exists.
2021-08-17 14:26:26.521 CST [3630510] CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction id 710 with commit timestamp 2021-08-17 14:26:26.403214+08
I looked into it briefly and think it is related to some code in the
apply_dispatch function. It sets the callback when
apply_error_callback_arg.command is 0, and resets the callback at the end of
the function. But apply_error_callback_arg.command is not reset to 0, so the
callback won't be set the next time apply_dispatch is called.
I tried to fix it with the following change, thoughts?
@@ -2455,7 +2455,10 @@ apply_dispatch(StringInfo s)
/* Pop the error context stack */
if (set_callback)
+ {
error_context_stack = errcallback.previous;
+ apply_error_callback_arg.command = 0;
+ }
}
Besides, if we make a change like this, do we still need to reset
apply_error_callback_arg.command in the reset_apply_error_context_info function?
Regards
Tang
On Tue, Aug 17, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Aug 17, 2021 at 10:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Aug 16, 2021 at 5:30 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Aug 16, 2021 at 5:54 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

Here is another comment:

+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
...
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM END";
...

I think most of the existing code uses "STREAM STOP" to describe the
LOGICAL_REP_MSG_STREAM_END message, is it better to return "STREAM STOP" in
function logicalrep_message_type() too ?

+1
I think you're right, it should be "STREAM STOP" in that case.

It's right that we use "STREAM STOP" rather than "STREAM END" in many
places such as elog messages, a callback name, and source code
comments. As far as I can find, there are only two places where we use
"STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
doc/src/sgml/protocol.sgml. Isn't it better to fix these
inconsistencies in the first place? I think “STREAM STOP” would be
more appropriate.

I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
seems to be a bit better because of the value 'E' we use for it.
But I think we don't care about the actual value of
LOGICAL_REP_MSG_STREAM_END since we use the enum value rather than
'E'?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Aug 18, 2021 at 6:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 17, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
It's right that we use "STREAM STOP" rather than "STREAM END" in many
places such as elog messages, a callback name, and source code
comments. As far as I can find, there are only two places where we use
"STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
doc/src/sgml/protocol.sgml. Isn't it better to fix these
inconsistencies in the first place? I think “STREAM STOP” would be
more appropriate.

I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
seems to be a bit better because of the value 'E' we use for it.

But I think we don't care about the actual value of
LOGICAL_REP_MSG_STREAM_END since we use the enum value rather than
'E'?
True, but here we are trying to be consistent with other enum values
where we try to use the first letter of the last word (which is E in
this case). I can see there are other cases where we are not
consistent, so it won't be a big deal if we aren't consistent here. I
am neutral on this one, so if you feel using STREAM_STOP would be
better from a code readability perspective then that is fine.
--
With Regards,
Amit Kapila.
On Wed, Aug 18, 2021 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 18, 2021 at 6:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 17, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
It's right that we use "STREAM STOP" rather than "STREAM END" in many
places such as elog messages, a callback name, and source code
comments. As far as I can find, there are only two places where we use
"STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
doc/src/sgml/protocol.sgml. Isn't it better to fix these
inconsistencies in the first place? I think “STREAM STOP” would be
more appropriate.

I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
seems to be a bit better because of the value 'E' we use for it.

But I think we don't care about the actual value of
LOGICAL_REP_MSG_STREAM_END since we use the enum value rather than
'E'?

True, but here we are trying to be consistent with other enum values
where we try to use the first letter of the last word (which is E in
this case). I can see there are other cases where we are not
consistent so it won't be a big deal if we won't be consistent here. I
am neutral on this one, so, if you feel using STREAM_STOP would be
better from a code readability perspective then that is fine.
In addition to code readability, there is a description in the doc
that mentions "Stream End" but we describe "Stream Stop" in the later
description, which seems like a bug in the doc to me:
The following messages (Stream Start, Stream End, Stream Commit, and
Stream Abort) are available since protocol version 2.
</para>
(snip)
<varlistentry>
<term>
Stream Stop
</term>
<listitem>
Perhaps it's better to hear other opinions too, but I've attached the
patch. Please review it.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
0001-Rename-LOGICAL_REP_MSG_STREAM_END-to-LOGICAL_REP_MSG.patch (application/octet-stream)
From 23ef544b82b7b25c2c30cbaacafed400d597fe08 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 18 Aug 2021 13:20:43 +0900
Subject: [PATCH] Rename LOGICAL_REP_MSG_STREAM_END to
LOGICAL_REP_MSG_STREAM_STOP.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Previously, we had the LOGICAL_REP_MSG_STREAM_END enum value in
LogicalRepMsgType, but we mostly used the term "STREAM STOP" in the
code, the logical decoding callback function name, and the
documentation. There were only a few places where we used the term
"STREAM END". Since we have LOGICAL_REP_MSG_STREAM_START, the term
"STREAM STOP" matches better. This commit improves the consistency
by renaming LOGICAL_REP_MSG_STREAM_END to LOGICAL_REP_MSG_STREAM_STOP.
---
doc/src/sgml/protocol.sgml | 2 +-
src/backend/replication/logical/proto.c | 2 +-
src/backend/replication/logical/worker.c | 2 +-
src/include/replication/logicalproto.h | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 91ec237c21..a232546b1d 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -7212,7 +7212,7 @@ Truncate
<para>
-The following messages (Stream Start, Stream End, Stream Commit, and
+The following messages (Stream Start, Stream Stop, Stream Commit, and
Stream Abort) are available since protocol version 2.
</para>
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 52b65e9572..9732982d93 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1074,7 +1074,7 @@ logicalrep_read_stream_start(StringInfo in, bool *first_segment)
void
logicalrep_write_stream_stop(StringInfo out)
{
- pq_sendbyte(out, LOGICAL_REP_MSG_STREAM_END);
+ pq_sendbyte(out, LOGICAL_REP_MSG_STREAM_STOP);
}
/*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ecaed157f2..38b493e4f5 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -2380,7 +2380,7 @@ apply_dispatch(StringInfo s)
apply_handle_stream_start(s);
return;
- case LOGICAL_REP_MSG_STREAM_END:
+ case LOGICAL_REP_MSG_STREAM_STOP:
apply_handle_stream_stop(s);
return;
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 2e29513151..95c1561ca0 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -65,7 +65,7 @@ typedef enum LogicalRepMsgType
LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
LOGICAL_REP_MSG_STREAM_START = 'S',
- LOGICAL_REP_MSG_STREAM_END = 'E',
+ LOGICAL_REP_MSG_STREAM_STOP = 'E',
LOGICAL_REP_MSG_STREAM_COMMIT = 'c',
LOGICAL_REP_MSG_STREAM_ABORT = 'A',
LOGICAL_REP_MSG_STREAM_PREPARE = 'p'
--
2.24.3 (Apple Git-128)
On Wed, Aug 18, 2021 at 10:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Aug 18, 2021 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 18, 2021 at 6:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 17, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
It's right that we use "STREAM STOP" rather than "STREAM END" in many
places such as elog messages, a callback name, and source code
comments. As far as I can find, there are only two places where we use
"STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
doc/src/sgml/protocol.sgml. Isn't it better to fix these
inconsistencies in the first place? I think “STREAM STOP” would be
more appropriate.

I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
seems to be a bit better because of the value 'E' we use for it.

But I think we don't care about the actual value of
LOGICAL_REP_MSG_STREAM_END since we use the enum value rather than
'E'?

True, but here we are trying to be consistent with other enum values
where we try to use the first letter of the last word (which is E in
this case). I can see there are other cases where we are not
consistent so it won't be a big deal if we won't be consistent here. I
am neutral on this one, so, if you feel using STREAM_STOP would be
better from a code readability perspective then that is fine.

In addition to code readability, there is a description in the doc
that mentions "Stream End" but we describe "Stream Stop" in the later
description, which seems like a bug in the doc to me:
The doc changes look good to me. But I have a question about the code change:
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -65,7 +65,7 @@ typedef enum LogicalRepMsgType
LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
LOGICAL_REP_MSG_STREAM_START = 'S',
- LOGICAL_REP_MSG_STREAM_END = 'E',
+ LOGICAL_REP_MSG_STREAM_STOP = 'E',
LOGICAL_REP_MSG_STREAM_COMMIT = 'c',
As this changes the enum name, any extension (a logical replication
extension) that has started using it would require a change. As the
enum was only just added in PG-14, it might be okay, but OTOH, as this
is just a code readability change, shall we do it only for PG-15?
--
With Regards,
Amit Kapila.
On Tues, Aug 17, 2021 1:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Aug 16, 2021 at 3:59 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
3)
Do we need to invoke set_apply_error_context_xact() in the function
apply_handle_stream_prepare() to save the xid and timestamp ?

Yes. I think that the v8-0001 patch already sets the xid and timestamp just
after parsing the stream_prepare message. Did you mean it's not necessary?
Sorry, I thought of something wrong, please ignore the above comment.
I'll submit the updated patches soon.
I was thinking about the place to set the errcallback.callback.
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
+ bool set_callback = false;
+
+ /*
+ * Push apply error context callback if not yet. Other fields will be
+ * filled during applying the change. Since this function can be called
+ * recursively when applying spooled changes, we set the callback only
+ * once.
+ */
+ if (apply_error_callback_arg.command == 0)
+ {
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
+ set_callback = true;
+ }
...
+ /* Pop the error context stack */
+ if (set_callback)
+ error_context_stack = errcallback.previous;
It seems we can put the above code in the function LogicalRepApplyLoop()
around invoking apply_dispatch(), and in that approach we don't need to worry
about the recursive case. What do you think?
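A minimal, self-contained sketch of that idea (using stub types in place of the real ErrorContextCallback and LogicalRepApplyLoop machinery from elog.h and worker.c, so the names here are stand-ins) could look like:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the real ErrorContextCallback in elog.h. */
typedef struct ErrorContextCallback
{
	struct ErrorContextCallback *previous;
	void		(*callback) (void *arg);
} ErrorContextCallback;

static ErrorContextCallback *error_context_stack = NULL;

static void
apply_error_callback(void *arg)
{
	(void) arg;					/* would format the errcontext line */
}

static void
apply_dispatch_stub(void)
{
	/* The caller already pushed the callback; nothing to push here. */
	assert(error_context_stack != NULL);
}

/*
 * Sketch of the apply loop: push the error context callback once before
 * entering the loop, pop it once after leaving, so recursive dispatching of
 * spooled changes never has to manage the stack itself.
 */
static void
logical_rep_apply_loop_stub(int nmessages)
{
	ErrorContextCallback errcallback;

	errcallback.callback = apply_error_callback;
	errcallback.previous = error_context_stack;
	error_context_stack = &errcallback;

	for (int i = 0; i < nmessages; i++)
		apply_dispatch_stub();

	/* Pop the error context stack */
	error_context_stack = errcallback.previous;
}
```

With the push/pop hoisted to the loop, apply_dispatch() no longer needs the `set_callback` bookkeeping at all.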
Best regards,
Hou zj
On Wed, Aug 18, 2021 at 3:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 18, 2021 at 10:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Aug 18, 2021 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 18, 2021 at 6:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 17, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
It's right that we use "STREAM STOP" rather than "STREAM END" in many
places such as elog messages, a callback name, and source code
comments. As far as I can find, there are only two places where we use
"STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
doc/src/sgml/protocol.sgml. Isn't it better to fix these
inconsistencies in the first place? I think “STREAM STOP” would be
more appropriate.

I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
seems to be a bit better because of the value 'E' we use for it.

But I think we don't care about the actual value of
LOGICAL_REP_MSG_STREAM_END since we use the enum value rather than
'E'?

True, but here we are trying to be consistent with other enum values
where we try to use the first letter of the last word (which is E in
this case). I can see there are other cases where we are not
consistent so it won't be a big deal if we won't be consistent here. I
am neutral on this one, so, if you feel using STREAM_STOP would be
better from a code readability perspective then that is fine.

In addition to code readability, there is a description in the doc
that mentions "Stream End" but we describe "Stream Stop" in the later
description, which seems like a bug in the doc to me:

The doc changes look good to me. But I have a question about the code change:
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -65,7 +65,7 @@ typedef enum LogicalRepMsgType
 LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
 LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
 LOGICAL_REP_MSG_STREAM_START = 'S',
- LOGICAL_REP_MSG_STREAM_END = 'E',
+ LOGICAL_REP_MSG_STREAM_STOP = 'E',
 LOGICAL_REP_MSG_STREAM_COMMIT = 'c',

As this changes the enum name, any extension (a logical replication
extension) that has started using it would require a change. As the
enum was only just added in PG-14, it might be okay, but OTOH, as this
is just a code readability change, shall we do it only for PG-15?
I think that the doc changes could be backpatched to PG14 but I think
we should do the code change only for PG15.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Aug 18, 2021 2:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Aug 18, 2021 at 3:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 18, 2021 at 10:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
In addition of a code readability, there is a description in the doc
that mentions "Stream End" but we describe "Stream Stop" in the
later description, which seems like a bug in the doc to me:

The doc changes look good to me. But I have a question about the code change:

--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -65,7 +65,7 @@ typedef enum LogicalRepMsgType
 LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
 LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
 LOGICAL_REP_MSG_STREAM_START = 'S',
- LOGICAL_REP_MSG_STREAM_END = 'E',
+ LOGICAL_REP_MSG_STREAM_STOP = 'E',
 LOGICAL_REP_MSG_STREAM_COMMIT = 'c',

As this changes the enum name, any extension (a logical replication
extension) that has started using it would require a change. As the
enum was only just added in PG-14, it might be okay, but OTOH, as this
is just a code readability change, shall we do it only for PG-15?

I think that the doc changes could be backpatched to PG14 but I think we
should do the code change only for PG15.
+1, and the patch looks good to me.
Best regards,
Hou zj
On Wed, Aug 18, 2021 at 3:33 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tues, Aug 17, 2021 1:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Aug 16, 2021 at 3:59 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
3)
Do we need to invoke set_apply_error_context_xact() in the function
apply_handle_stream_prepare() to save the xid and timestamp ?

Yes. I think that the v8-0001 patch already sets the xid and timestamp just
after parsing the stream_prepare message. Did you mean it's not necessary?

Sorry, I thought of something wrong, please ignore the above comment.
I'll submit the updated patches soon.
I was thinking about the place to set the errcallback.callback.
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ ErrorContextCallback errcallback;
+ bool set_callback = false;
+
+ /*
+ * Push apply error context callback if not yet. Other fields will be
+ * filled during applying the change. Since this function can be called
+ * recursively when applying spooled changes, we set the callback only
+ * once.
+ */
+ if (apply_error_callback_arg.command == 0)
+ {
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
+ set_callback = true;
+ }
...
+ /* Pop the error context stack */
+ if (set_callback)
+ error_context_stack = errcallback.previous;

It seems we can put the above code in the function LogicalRepApplyLoop()
around invoking apply_dispatch(), and in that approach we don't need to worry
about the recursive case. What do you think ?
Thank you for the comment!
I think you're right. Maybe we can set the callback before entering to
the main loop and pop it after breaking from it. It would also fix the
problem reported by Tang[1]. But one thing we need to note is that since
we want to reset apply_error_callback_arg.command at the end of
apply_dispatch() (otherwise we could end up setting the apply error
context to an irrelevant error such as network error), when
apply_dispatch() is called recursively probably we need to save the
apply_error_callback_arg.command before setting the new command and
then revert back to the saved command. Is that right?
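For illustration only, the save-and-restore idea could be modeled with a standalone sketch. The names here (current_command, dispatch) are simplified stand-ins for apply_error_callback_arg.command and apply_dispatch(), not the actual PostgreSQL code:

```c
#include <assert.h>

/*
 * Standalone model (not the actual worker.c code): current_command
 * stands in for apply_error_callback_arg.command, with 0 meaning
 * "no command is being applied".
 */
static int	current_command = 0;

void
dispatch(int command, int spooled_command)
{
	int			saved_command = current_command;

	current_command = command;

	/* e.g. STREAM COMMIT re-entering dispatch for spooled changes */
	if (spooled_command != 0)
		dispatch(spooled_command, 0);

	/*
	 * Revert to the caller's command instead of plain 0, so the outer
	 * invocation's error context stays correct after we return.
	 */
	current_command = saved_command;
}
```

After the outermost call returns, the field is back to 0, so a later unrelated error (such as a network error) does not pick up a stale command in its context.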
Regards,
[1]: /messages/by-id/OS0PR01MB6113E5BC24922A2D05D16051FBFE9@OS0PR01MB6113.jpnprd01.prod.outlook.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Aug 18, 2021 at 5:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Aug 18, 2021 at 3:33 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tues, Aug 17, 2021 1:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Aug 16, 2021 at 3:59 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
3)
Do we need to invoke set_apply_error_context_xact() in the function
apply_handle_stream_prepare() to save the xid and timestamp?

Yes. I think that the v8-0001 patch already sets the xid and timestamp just after parsing
the stream_prepare message. You meant it's not necessary?

Sorry, I thought of something wrong, please ignore the above comment.
I'll submit the updated patches soon.
I was thinking about the place to set the errcallback.callback.
apply_dispatch(StringInfo s)
{
	LogicalRepMsgType action = pq_getmsgbyte(s);
+	ErrorContextCallback errcallback;
+	bool set_callback = false;
+
+	/*
+	 * Push apply error context callback if not yet. Other fields will be
+	 * filled during applying the change. Since this function can be called
+	 * recursively when applying spooled changes, we set the callback only
+	 * once.
+	 */
+	if (apply_error_callback_arg.command == 0)
+	{
+		errcallback.callback = apply_error_callback;
+		errcallback.previous = error_context_stack;
+		error_context_stack = &errcallback;
+		set_callback = true;
+	}
...
+	/* Pop the error context stack */
+	if (set_callback)
+		error_context_stack = errcallback.previous;

It seems we can put the above code in the function LogicalRepApplyLoop()
around invoking apply_dispatch(), and in that approach we don't need to worry
about the recursive case. What do you think?

Thank you for the comment!
I think you're right. Maybe we can set the callback before entering to
the main loop and pop it after breaking from it. It would also fix the
problem reported by Tang[1]. But one thing we need to note is that since
we want to reset apply_error_callback_arg.command at the end of
apply_dispatch() (otherwise we could end up setting the apply error
context to an irrelevant error such as network error), when
apply_dispatch() is called recursively probably we need to save the
apply_error_callback_arg.command before setting the new command and
then revert back to the saved command. Is that right?
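The quoted suggestion of registering the callback once around the apply loop can be sketched standalone as below. The ErrorContextCallback type and error_context_stack here mirror, but are not, the PostgreSQL definitions (which live in elog.h), so the sketch compiles on its own:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for PostgreSQL's error context stack machinery */
typedef struct ErrorContextCallback
{
	void		(*callback) (void *arg);
	void	   *arg;
	struct ErrorContextCallback *previous;
} ErrorContextCallback;

static ErrorContextCallback *error_context_stack = NULL;

static void
apply_error_callback(void *arg)
{
	/* the real callback would emit errcontext(...) here */
	(void) arg;
}

static void
apply_dispatch(void)
{
	/* the callback is already on the stack; just apply the change */
	assert(error_context_stack != NULL);
}

void
LogicalRepApplyLoop(void)
{
	ErrorContextCallback errcallback;

	/* Push the callback once, before entering the main loop. */
	errcallback.callback = apply_error_callback;
	errcallback.arg = NULL;
	errcallback.previous = error_context_stack;
	error_context_stack = &errcallback;

	for (int i = 0; i < 3; i++)	/* stand-in for the receive loop */
		apply_dispatch();

	/* Pop it once, after breaking out of the loop. */
	error_context_stack = errcallback.previous;
}
```

With this placement there is a single push/pop per loop lifetime, so recursive calls to apply_dispatch() need no guard at all; only the command save/restore inside apply_dispatch() remains.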
I've attached the updated version patches that incorporated all
comments I got so far unless I'm missing something. Please review
them.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v9-0005-Move-shared-fileset-cleanup-to-before_shmem_exit.patch
From d90648909d7db6abf4938a4cbdfa8c3705235e7f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 12 Aug 2021 10:57:41 +0900
Subject: [PATCH v9 5/5] Move shared fileset cleanup to before_shmem_exit().
The reported problem is that a shared fileset created in
SharedFileSetInit() by a logical replication apply worker is cleaned up
in SharedFileSetDeleteOnProcExit() when the process exits on an error
due to a conflict. As shared fileset cleanup causes pgstat reporting
for underlying temporary files, the assertions added in ee3f8d3d3ae
caused failures.
To fix the problem, similar to 675c945394, move shared fileset cleanup
to a before_shmem_exit() hook, ensuring that the fileset is dropped
while we can still report stats for underlying temporary files.
---
src/backend/storage/file/sharedfileset.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/src/backend/storage/file/sharedfileset.c b/src/backend/storage/file/sharedfileset.c
index ed37c940ad..0d9700bf56 100644
--- a/src/backend/storage/file/sharedfileset.c
+++ b/src/backend/storage/file/sharedfileset.c
@@ -36,7 +36,7 @@
static List *filesetlist = NIL;
static void SharedFileSetOnDetach(dsm_segment *segment, Datum datum);
-static void SharedFileSetDeleteOnProcExit(int status, Datum arg);
+static void SharedFileSetDeleteBeforeShmemExit(int status, Datum arg);
static void SharedFileSetPath(char *path, SharedFileSet *fileset, Oid tablespace);
static void SharedFilePath(char *path, SharedFileSet *fileset, const char *name);
static Oid ChooseTablespace(const SharedFileSet *fileset, const char *name);
@@ -112,7 +112,12 @@ SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
* fileset clean up.
*/
Assert(filesetlist == NIL);
- on_proc_exit(SharedFileSetDeleteOnProcExit, 0);
+
+ /*
+ * Register before-shmem-exit hook to ensure fileset is dropped
+ * while we can still report stats for underlying temporary files.
+ */
+ before_shmem_exit(SharedFileSetDeleteBeforeShmemExit, 0);
registered_cleanup = true;
}
@@ -259,12 +264,12 @@ SharedFileSetOnDetach(dsm_segment *segment, Datum datum)
}
/*
- * Callback function that will be invoked on the process exit. This will
+ * Callback function that will be invoked before shmem exit. This will
* process the list of all the registered sharedfilesets and delete the
* underlying files.
*/
static void
-SharedFileSetDeleteOnProcExit(int status, Datum arg)
+SharedFileSetDeleteBeforeShmemExit(int status, Datum arg)
{
/*
* Remove all the pending shared fileset entries. We don't use foreach()
--
2.24.3 (Apple Git-128)
v9-0003-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch
From c492a98d3d597c324a7eeabd18b257238f00c0e7 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v9 3/5] Add RESET command to ALTER SUBSCRIPTION command.
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters to their default values. The parameters that can be reset are
streaming, binary, and synchronous_commit.

The RESET command is required by a follow-up commit that introduces a
new parameter, skip_xid, which needs a way to be reset.
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 78 +++++++++++++++++-----
src/backend/parser/gram.y | 11 ++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 +++-
src/test/regress/sql/subscription.sql | 13 ++++
6 files changed, 109 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..8c3c28b7e7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,16 +193,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, and <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5157f44058..cc390ce95a 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -923,10 +938,43 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +1009,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1011,7 +1059,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1059,7 +1107,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..bcf85e8980 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,7 +9707,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e28248af32..504d65f7d6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,7 +3659,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3671,7 +3672,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 77b4437b69..b87f67fe55 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -284,11 +284,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index d42104c191..aa90560691 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -218,6 +218,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
v9-0001-Add-logical-changes-details-to-errcontext-of-appl.patch
From 5788e954d3d1370a54b5a1ab296b49e6d8b1b325 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v9 1/5] Add logical changes details to errcontext of apply
worker errors.
Previously, the error context was set only for data conversion
failures. This commit expands the error context to add the details of
the logical change being applied by the apply worker, newly showing
the command, transaction, and commit timestamp.

This additional information can be used by the follow-up commit that
enables skipping a particular transaction on the subscriber.
---
src/backend/replication/logical/proto.c | 53 +++++
src/backend/replication/logical/worker.c | 250 ++++++++++++++++-------
src/include/replication/logicalproto.h | 1 +
src/tools/pgindent/typedefs.list | 2 +-
4 files changed, 226 insertions(+), 80 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 52b65e9572..7aa3452609 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1156,3 +1156,56 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Get string representing LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_END:
+ return "STREAM STOP";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_PREPARE:
+ return "STREAM PREPARE";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+
+ return NULL; /* keep compiler quiet */
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ecaed157f2..58923094f0 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -203,12 +203,6 @@ typedef struct FlushPosition
static dlist_head lsn_mapping = DLIST_STATIC_INIT(lsn_mapping);
-typedef struct SlotErrCallbackArg
-{
- LogicalRepRelMapEntry *rel;
- int remote_attnum;
-} SlotErrCallbackArg;
-
typedef struct ApplyExecutionData
{
EState *estate; /* executor state, used to track resources */
@@ -221,6 +215,27 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrorCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+ LogicalRepRelMapEntry *rel;
+
+ /* Remote information */
+ int remote_attnum; /* -1 if invalid */
+ TransactionId remote_xid;
+ TimestampTz commit_ts;
+} ApplyErrorCallbackArg;
+
+static ApplyErrorCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .rel = NULL,
+ .remote_attnum = -1,
+ .remote_xid = InvalidTransactionId,
+ .commit_ts = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -335,6 +350,11 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for apply error callback */
+static void apply_error_callback(void *arg);
+static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts);
+static inline void reset_apply_error_context_info(void);
+
/*
* Should this worker apply changes for given relation.
*
@@ -580,26 +600,6 @@ slot_fill_defaults(LogicalRepRelMapEntry *rel, EState *estate,
ExecEvalExpr(defexprs[i], econtext, &slot->tts_isnull[defmap[i]]);
}
-/*
- * Error callback to give more context info about data conversion failures
- * while reading data from the remote server.
- */
-static void
-slot_store_error_callback(void *arg)
-{
- SlotErrCallbackArg *errarg = (SlotErrCallbackArg *) arg;
- LogicalRepRelMapEntry *rel;
-
- /* Nothing to do if remote attribute number is not set */
- if (errarg->remote_attnum < 0)
- return;
-
- rel = errarg->rel;
- errcontext("processing remote data for replication target relation \"%s.%s\" column \"%s\"",
- rel->remoterel.nspname, rel->remoterel.relname,
- rel->remoterel.attnames[errarg->remote_attnum]);
-}
-
/*
* Store tuple data into slot.
*
@@ -611,19 +611,9 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
{
int natts = slot->tts_tupleDescriptor->natts;
int i;
- SlotErrCallbackArg errarg;
- ErrorContextCallback errcallback;
ExecClearTuple(slot);
- /* Push callback + info on the error context stack */
- errarg.rel = rel;
- errarg.remote_attnum = -1;
- errcallback.callback = slot_store_error_callback;
- errcallback.arg = (void *) &errarg;
- errcallback.previous = error_context_stack;
- error_context_stack = &errcallback;
-
/* Call the "in" function for each non-dropped, non-null attribute */
Assert(natts == rel->attrmap->maplen);
for (i = 0; i < natts; i++)
@@ -637,7 +627,7 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
Assert(remoteattnum < tupleData->ncols);
- errarg.remote_attnum = remoteattnum;
+ apply_error_callback_arg.remote_attnum = remoteattnum;
if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
{
@@ -685,7 +675,7 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
slot->tts_isnull[i] = true;
}
- errarg.remote_attnum = -1;
+ apply_error_callback_arg.remote_attnum = -1;
}
else
{
@@ -699,9 +689,6 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
}
}
- /* Pop the error context stack */
- error_context_stack = errcallback.previous;
-
ExecStoreVirtualTuple(slot);
}
@@ -724,8 +711,6 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
{
int natts = slot->tts_tupleDescriptor->natts;
int i;
- SlotErrCallbackArg errarg;
- ErrorContextCallback errcallback;
/* We'll fill "slot" with a virtual tuple, so we must start with ... */
ExecClearTuple(slot);
@@ -739,14 +724,6 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
memcpy(slot->tts_values, srcslot->tts_values, natts * sizeof(Datum));
memcpy(slot->tts_isnull, srcslot->tts_isnull, natts * sizeof(bool));
- /* For error reporting, push callback + info on the error context stack */
- errarg.rel = rel;
- errarg.remote_attnum = -1;
- errcallback.callback = slot_store_error_callback;
- errcallback.arg = (void *) &errarg;
- errcallback.previous = error_context_stack;
- error_context_stack = &errcallback;
-
/* Call the "in" function for each replaced attribute */
Assert(natts == rel->attrmap->maplen);
for (i = 0; i < natts; i++)
@@ -763,7 +740,7 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
{
StringInfo colvalue = &tupleData->colvalues[remoteattnum];
- errarg.remote_attnum = remoteattnum;
+ apply_error_callback_arg.remote_attnum = remoteattnum;
if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
{
@@ -807,13 +784,10 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
slot->tts_isnull[i] = true;
}
- errarg.remote_attnum = -1;
+ apply_error_callback_arg.remote_attnum = -1;
}
}
- /* Pop the error context stack */
- error_context_stack = errcallback.previous;
-
/* And finally, declare that "slot" contains a valid virtual tuple */
ExecStoreVirtualTuple(slot);
}
@@ -827,6 +801,7 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.committime);
remote_final_lsn = begin_data.final_lsn;
@@ -860,6 +835,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -877,6 +853,7 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.prepare_time);
remote_final_lsn = begin_data.prepare_lsn;
@@ -962,6 +939,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -974,6 +952,7 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -1001,6 +980,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1013,6 +993,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1050,6 +1031,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1076,6 +1058,7 @@ apply_handle_stream_prepare(StringInfo s)
errmsg_internal("tablesync worker received a STREAM PREPARE message")));
logicalrep_read_stream_prepare(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.prepare_time);
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
@@ -1100,6 +1083,8 @@ apply_handle_stream_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1156,6 +1141,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ set_apply_error_context_xact(stream_xid, 0);
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1212,6 +1199,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1235,7 +1223,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ set_apply_error_context_xact(xid, 0);
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1260,6 +1251,8 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ set_apply_error_context_xact(subxid, 0);
+
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1284,6 +1277,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1315,6 +1309,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1459,6 +1455,7 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ set_apply_error_context_xact(xid, commit_data.committime);
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1473,6 +1470,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1592,6 +1591,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1615,6 +1617,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1713,6 +1718,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1766,6 +1774,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1869,6 +1880,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1894,6 +1908,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2328,44 +2345,52 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ LogicalRepMsgType saved_command;
+
+ /*
+ * Set the current command being applied. Since this function can be called
+ * recursively when applying spooled changes, save the current command.
+ */
+ saved_command = apply_error_callback_arg.command;
+ apply_error_callback_arg.command = action;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2374,49 +2399,52 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_END:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_PREPARE:
apply_handle_stream_prepare(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Reset the current command */
+ apply_error_callback_arg.command = saved_command;
}
/*
@@ -2517,6 +2545,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
TimeLineID tli;
+ ErrorContextCallback errcallback;
/*
* Init the ApplyMessageContext which we clean up after each replication
@@ -2537,6 +2566,14 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
/* mark as idle, before starting to loop */
pgstat_report_activity(STATE_IDLE, NULL);
+ /*
+ * Push apply error context callback. Fields will be filled during applying
+ * a change.
+ */
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
+
/* This outer loop iterates once per wait. */
for (;;)
{
@@ -2737,6 +2774,9 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
+
/* All done */
walrcv_endstreaming(LogRepWorkerWalRcvConn, &tli);
}
@@ -3649,3 +3689,55 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+/* Error callback to give more context info about the change being applied */
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+ ApplyErrorCallbackArg *errarg = &apply_error_callback_arg;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("processing remote data during \"%s\""),
+ logicalrep_message_type(errarg->command));
+
+ if (errarg->rel)
+ appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname);
+
+ if (errarg->remote_attnum >= 0)
+ appendStringInfo(&buf, _(" column \"%s\""),
+ errarg->rel->remoterel.attnames[errarg->remote_attnum]);
+
+ if (TransactionIdIsNormal(errarg->remote_xid))
+ appendStringInfo(&buf, _(" in transaction %u with commit timestamp %s"),
+ errarg->remote_xid,
+ errarg->commit_ts == 0
+ ? "(unset)"
+ : timestamptz_to_str(errarg->commit_ts));
+
+ errcontext("%s", buf.data);
+ pfree(buf.data);
+}
+
+/* Set transaction information of apply error callback */
+static inline void
+set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts)
+{
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.commit_ts = commit_ts;
+}
+
+/* Reset all information of apply error callback */
+static inline void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.rel = NULL;
+ apply_error_callback_arg.remote_attnum = -1;
+ set_apply_error_context_xact(InvalidTransactionId, 0);
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 2e29513151..af89f58fd3 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -246,5 +246,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 37cf4b2f76..621d0cb4da 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -113,6 +113,7 @@ Append
AppendPath
AppendRelInfo
AppendState
+ApplyErrorCallbackArg
ApplyExecutionData
ApplySubXactData
Archive
@@ -2423,7 +2424,6 @@ SlabBlock
SlabChunk
SlabContext
SlabSlot
-SlotErrCallbackArg
SlotNumber
SlruCtl
SlruCtlData
--
2.24.3 (Apple Git-128)
From 96536c25bd498c23865304c6ff85be26b54c65af Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v9 4/5] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates a constraint, logical replication stops
until the problem is resolved. This commit introduces another way to
skip the transaction in question.

The user can specify the XID with ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), which updates the pg_subscription.subskipxid field and tells the
apply worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.

After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of the
subscription in the pg_stat_subscription_errors system view so that the
user is not confused. This is done by sending a message to the stats
collector to clear the subscription error.
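For illustration, the intended user workflow would look like the following (the subscription name and XID are taken from the documentation example below and are illustrative only):

```sql
-- Find the failing transaction's XID from the error statistics view
SELECT subname, command, xid, last_failure_message
  FROM pg_stat_subscription_errors;

-- Tell the apply worker to skip that transaction
ALTER SUBSCRIPTION test_sub SET (skip_xid = '716');

-- After the transaction is skipped, subskipxid is cleared automatically
SELECT subskipxid FROM pg_subscription WHERE subname = 'test_sub';
```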
---
doc/src/sgml/logical-replication.sgml | 49 ++++-
doc/src/sgml/ref/alter_subscription.sgml | 32 ++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 45 +++-
src/backend/postmaster/pgstat.c | 44 +++-
src/backend/replication/logical/worker.c | 201 ++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/include/pgstat.h | 7 +-
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 +
src/test/subscription/t/024_skip_xact.pl | 244 +++++++++++++++++++++
12 files changed, 636 insertions(+), 25 deletions(-)
create mode 100644 src/test/subscription/t/024_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..d558dcfe81 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+   transaction that conflicts with the existing data. When a conflict produces
+   an error, it is shown in the <structname>pg_stat_subscription_errors</structname>
+   view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+   and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+   The transaction ID that contains the change violating the constraint can be
+   found from those outputs (transaction ID 716 in the above case). The transaction
+   can be skipped by setting <replaceable>skip_xid</replaceable> for the subscription
+   with <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+   can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+   Either way, these options should be used as a last resort. They skip the
+   whole transaction, including changes that may not violate any constraint,
+   and can easily make the subscriber inconsistent if the user specifies the
+   wrong transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 8c3c28b7e7..cfb318e08c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -205,8 +205,36 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are: <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+      <literal>streaming</literal>, and the following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+        If incoming data violates any constraint, logical replication
+        will stop until it is resolved. The resolution can be done either
+        by changing data on the subscriber so that it does not conflict
+        with the incoming change or by skipping the whole transaction.
+        This option specifies the ID of the transaction whose application
+        the logical replication worker will skip. The logical replication
+        worker skips all data modification changes within the specified
+        transaction. Therefore, since it skips the whole transaction,
+        including changes that may not violate any constraint, it should
+        only be used as a last resort. This option has no effect on a
+        transaction that is already prepared with
+        <literal>two_phase</literal> enabled on the subscriber. After the
+        logical replication worker successfully skips the transaction, the
+        transaction ID (stored in
+        <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+        is cleared. See <xref linkend="logical-replication-conflicts"/> for
+        the details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index cc390ce95a..188f3e42fd 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction ID")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -885,7 +913,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, false);
@@ -934,6 +962,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
@@ -941,7 +976,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, true);
@@ -967,6 +1002,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8179e3995d..02bee2e6fb 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1743,11 +1743,32 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
msg.m_subid = subid;
msg.m_subrelid = subrelid;
+ msg.m_clear = false;
msg.m_reset = true;
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
}
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error of the subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_clear = true;
+ msg.m_reset = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -2035,6 +2056,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_subrelid = subrelid;
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
+ msg.m_clear = false;
msg.m_reset = false;
msg.m_command = command;
msg.m_xid = xid;
@@ -6134,27 +6156,37 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
PgStat_StatSubErrEntry *errent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
errent = pgstat_get_subscription_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
+ Assert(!(msg->m_reset && msg->m_clear));
+
if (errent == NULL)
return;
- /* reset fields and set reset timestamp */
errent->relid = InvalidOid;
errent->command = 0;
errent->xid = InvalidTransactionId;
errent->failure_count = 0;
- errent->last_failure = 0;
- errent->last_errmsg[0] = '\0';
- errent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 38f88cb70b..b11c0c04a0 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -269,6 +270,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID if we're skipping all data modification changes
+ * (INSERT/DELETE/UPDATE/TRUNCATE) of the transaction specified by
+ * MySubscription->skipxid. Note that we don't skip receiving the changes,
+ * particularly in streaming cases, since we decide whether or not to skip
+ * applying the changes when starting to apply them. Once we start skipping
+ * changes, we copy the XID to skipping_xid and don't stop skipping until the
+ * whole transaction is skipped, even if the subscription is invalidated and
+ * MySubscription->skipxid gets changed or reset. When we stop skipping, we
+ * reset the skip XID (subskipxid) in the pg_subscription catalog and associate
+ * the origin status with the transaction that resets the skip XID so that we
+ * can start streaming from the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -355,6 +371,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -805,6 +824,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -829,7 +853,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop the skipping transaction if enabled. Otherwise, commit the changes
+ * that are just applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -857,6 +892,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -915,9 +953,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -931,6 +970,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop the skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1062,6 +1105,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1072,6 +1118,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop the skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1097,9 +1147,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1121,6 +1172,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1133,9 +1187,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1217,6 +1268,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1310,6 +1362,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop the skipping transaction if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1459,9 +1515,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2347,6 +2417,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be called
* recursively when applying spooled changes, save the current command.
@@ -3775,3 +3856,103 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID. Also, reset the skip XID
+ * (pg_subscription.subskipxid). If origin_lsn and origin_committs are valid, we
+ * set origin state to the transaction commit that resets the skip XID so that we
+ * can start streaming from the transaction next to the one that we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know the
+ * subscription is no longer getting stuck by the conflict.
+ *
+ * The message for clearing the error statistics can be lost but that's
+ * okay. The user can know the logical replication is working fine in
+ * other ways, for example, checking pg_stat_subscription view. And the
+ * user is able to reset the single subscription error statistics by
+ * pg_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 504d65f7d6..aec06b0d23 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3677,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index a6914a24e5..6775736b2b 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -536,7 +536,7 @@ typedef struct PgStat_MsgReplSlot
/* ----------
* PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
- * update/reset the error happening during logical
+ * update/reset/clear the error happening during logical
* replication.
* ----------
*/
@@ -554,7 +554,9 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The reset message uses below field */
+ /* The clear and reset messages use the fields below */
+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */
bool m_reset; /* Reset all fields and set reset_stats
* timestamp */
@@ -1111,6 +1113,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index b87f67fe55..217b5fabd1 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -296,6 +296,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
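The rejected values in the tests above reflect that XIDs 0, 1, and 2 are reserved in PostgreSQL (invalid, bootstrap, and frozen), so only normal XIDs from 3 up to the 32-bit maximum 4294967295 can be skipped. A self-contained sketch of such a validity check (the constant mirrors FirstNormalTransactionId from transam.h; the parsing helper itself is hypothetical, not the patch's actual parser):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define FirstNormalTransactionId 3   /* 0=Invalid, 1=Bootstrap, 2=Frozen */

/* Return true iff s parses as a normal (assignable) 32-bit XID. */
static bool parse_skip_xid(const char *s, uint32_t *xid)
{
    char *end;
    unsigned long v;

    errno = 0;
    v = strtoul(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v > UINT32_MAX)
        return false;            /* not a plain integer in range */
    if (v < FirstNormalTransactionId)
        return false;            /* reserved XIDs cannot be skipped */
    *xid = (uint32_t) v;
    return true;
}
```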
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index aa90560691..4c9d25f0a4 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -231,6 +231,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/024_skip_xact.pl b/src/test/subscription/t/024_skip_xact.pl
new file mode 100644
index 0000000000..affb663803
--- /dev/null
+++ b/src/test/subscription/t/024_skip_xact.pl
@@ -0,0 +1,244 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# Test if the error reported on the pg_stat_subscription_errors view is expected.
+sub test_subscription_error
+{
+ my ($node, $source, $relname, $expected_error, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Check the error reported on the pg_stat_subscription_errors view and skip
+# the failed transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $source, $subname, $relname, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $source, $relname, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber to restart logical replication without waiting for
+ # wal_retrieve_retry_interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+ # Also wait for the error details to be cleared.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT command IS NULL FROM pg_stat_subscription_errors
+WHERE subname = '$subname' AND failure_source = '$source';
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# Raise wal_retrieve_retry_interval so that repeated apply failures don't
+# flood the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violating
+# the unique constraint on test_tab1. Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber,
+ 'tablesync', 'test_tab2',
+ qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'check the error reported by the table sync worker');
+
+# Insert enough rows into test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber while applying the spooled changes for the same reason. Then
+# skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error reported by the apply worker while applying streamed changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping transactions that are prepared, including streamed ones. We
+# insert the same data as in the previous tests but prepare the transactions. Those
+# insertions raise errors on the subscriber. Then we skip the transactions in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error on changes of the prepared transaction');
+
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
+# Check that the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
[Attachment: v9-0002-Add-pg_stat_subscription_errors-statistics-view.patch (application/octet-stream)]
From 44d16851b7781cce05416fe32fbf5ea5d43cbdac Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v9 2/5] Add pg_stat_subscription_errors statistics view.
This commit adds a new system view, pg_stat_subscription_errors,
showing errors that happen while applying logical replication changes
as well as while performing initial table synchronization.
The subscription error entries are removed by autovacuum workers: when
the table synchronization has completed, in table sync worker cases, and
when the subscription is dropped, in apply worker cases.
It also adds the SQL function pg_stat_reset_subscription_error() to
reset the error statistics of a single subscription.
---
doc/src/sgml/monitoring.sgml | 160 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/postmaster/pgstat.c | 651 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 40 +-
src/backend/utils/adt/pgstatfuncs.c | 119 +++++
src/backend/utils/error/elog.c | 16 +
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 106 ++++
src/include/utils/elog.h | 1 +
src/test/regress/expected/rules.out | 22 +
src/tools/pgindent/typedefs.list | 5 +
12 files changed, 1159 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..0fd0bbfa1f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per subscription error, showing information about
+ errors that happened during logical replication.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error happened. This
+ field is always NULL if the error is reported by a
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID on the publisher node that was being applied when
+ the error happened. This field is always NULL if the error is
+ reported by a <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times the error has happened in this worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the time of the last failure.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..449692afa9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index a3c35bdf60..8179e3995d 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -40,6 +40,8 @@
#include "access/xact.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +107,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_HASH_SIZE 32
/* ----------
@@ -279,6 +282,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -329,6 +333,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubEntry *pgstat_get_subscription_entry(Oid subid,
+ bool create);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -368,6 +378,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1146,6 +1160,166 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in the stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubEntry *subent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS hstat_rel;
+ List *not_ready_rels_list;
+ HTAB *not_ready_rels_htab;
+ ListCell *lc;
+ HASHCTL hash_ctl;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = subent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (subent->suberrors == NULL)
+ continue;
+
+ /*
+ * The subscription has error entries. We search for errors of the
+ * table sync workers whose tables are already in ready state. Those
+ * errors should be removed.
+ *
+ * Note that the lifetimes of the error entries of the apply worker
+ * and the table sync worker are different. The former lives until
+ * the subscription is dropped whereas the latter lives until the
+ * table synchronization is completed.
+ */
+ not_ready_rels_list = GetSubscriptionNotReadyRelations(subent->subid);
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ /*
+ * The number of not-ready relations can be high for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for a faster lookup.
+ */
+ foreach(lc, not_ready_rels_list)
+ {
+ SubscriptionRelState *r_elem = (SubscriptionRelState *) lfirst(lc);
+ SubscriptionRelState *r_entry;
+
+ CHECK_FOR_INTERRUPTS();
+ r_entry = hash_search(not_ready_rels_htab, (void *) &(r_elem->relid),
+ HASH_ENTER, NULL);
+ memcpy(r_entry, r_elem, sizeof(SubscriptionRelState));
+ }
+
+ list_free(not_ready_rels_list);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+
+ /*
+ * Search for all error entries whose relation is already in ready
+ * state.
+ */
+ hash_seq_init(&hstat_rel, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(errent->subrelid))
+ continue;
+
+ /*
+ * Add the relid to the message if the table synchronization
+ * for this relation is already complete or the table is no
+ * longer subscribed.
+ */
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->subrelid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(not_ready_rels_htab);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
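The purge logic above uses a fill-and-flush batching idiom: accumulate OIDs in a fixed-size message, send it once full, and send one final partial message for the remainder. A self-contained sketch of that idiom, with a counter standing in for pgstat_send() (BATCH_CAP and the names are illustrative):

```c
#include <assert.h>

#define BATCH_CAP 4              /* stands in for PGSTAT_NUM_SUBSCRIPTIONPURGE */

static int sends;                /* number of flushes ("messages sent") */

static void flush(const unsigned *ids, int n)
{
    (void) ids;
    (void) n;
    sends++;                     /* a real implementation calls pgstat_send() */
}

/* Send nids OIDs in batches of at most BATCH_CAP, flushing the remainder. */
static void purge_batched(const unsigned *oids, int nids)
{
    unsigned buf[BATCH_CAP];
    int n = 0;

    for (int i = 0; i < nids; i++)
    {
        buf[n++] = oids[i];
        if (n >= BATCH_CAP)      /* message full: send and reset to empty */
        {
            flush(buf, n);
            n = 0;
        }
    }
    if (n > 0)                   /* partial final message */
        flush(buf, n);
}
```

Forgetting the trailing partial flush is the classic bug in this idiom; the patch handles it for both the subscription and error-entry messages.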
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1555,6 +1729,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the subscription error.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1820,6 +2013,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about the subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_reset = false;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_failure_time = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
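pgstat_report_subscription_error() above sizes the message as offsetof up to the variable-length tail plus the string length, so only the used part of the fixed-size errmsg buffer crosses the wire. A self-contained sketch of that offsetof-based sizing (the struct layout is illustrative; counting the NUL terminator here is this sketch's assumption, not necessarily what the patch does):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ERRMSG_CAP 256           /* stands in for PGSTAT_SUBSCRIPTIONERR_MSGLEN */

typedef struct ErrMsg
{
    unsigned subid;
    unsigned relid;
    char     errmsg[ERRMSG_CAP]; /* fixed buffer; only the used prefix is sent */
} ErrMsg;

/* Compute the number of bytes actually worth sending for this message. */
static size_t errmsg_wire_len(const ErrMsg *m)
{
    /* fixed header fields up to errmsg, plus the string and its terminator */
    return offsetof(ErrMsg, errmsg) + strlen(m->errmsg) + 1;
}
```

For short error messages this is far smaller than sizeof(ErrMsg), which is the point of truncating the send.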
+
/* ----------
* pgstat_ping() -
*
@@ -2886,6 +3110,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3463,6 +3703,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3763,6 +4016,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionHash)
+ {
+ PgStat_StatSubEntry *subent;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS relhstat;
+ int32 nerrors;
+
+ /* Skip this subscription if it doesn't have any errors */
+ if (subent->suberrors == NULL)
+ continue;
+
+ nerrors = hash_get_num_entries(subent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(subent, sizeof(PgStat_StatSubEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubErrEntry entry, which
+ * contains the fixed-length error message string of
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes, bloating the stats
+ * file. That's okay as long as the number of error entries
+ * stays low. If that expectation turns out to be false, we
+ * should write the string and its length instead.
+ */
+ rc = fwrite(errent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4224,6 +4521,100 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubEntry *subent;
+ int32 nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin) !=
+ sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription error entry */
+ subent =
+ (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &(subbuf.subid),
+ HASH_ENTER, NULL);
+ subent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Read error entries */
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubErrEntry errbuf;
+ PgStat_StatSubErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ errent =
+ (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &(errbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(errent, &errbuf, sizeof(PgStat_StatSubErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4566,6 +4957,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4771,6 +5206,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5689,6 +6125,116 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ if (errent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ Assert(errent);
+
+ /* update the error entry */
+ errent->databaseid = msg->m_databaseid;
+ errent->relid = msg->m_relid;
+ errent->command = msg->m_command;
+ errent->xid = msg->m_xid;
+ errent->failure_count++;
+ errent->last_failure = msg->m_failure_time;
+ strlcpy(errent->last_errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (subent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (subent->suberrors != NULL)
+ hash_destroy(subent->suberrors);
+
+ /* Remove the subscription entry */
+ (void) hash_search(subscriptionHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subid, false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription with msg->m_subid is removed and the
+ * corresponding entry is also removed before receiving the error purge
+ * message.
+ */
+ if (subent == NULL)
+ return;
+
+ /* Nothing to do if this subscription has no error entries */
+ if (subent->suberrors == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ (void) hash_search(subent->suberrors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5786,6 +6332,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_entry
+ *
+ * Return the subscription statistics entry for the given subscription OID. Return NULL
+ * if not found and the caller didn't request to create it.
+ *
+ * 'create' tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubEntry *
+pgstat_get_subscription_entry(Oid subid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ subent = (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize field */
+ if (create && !found)
+ subent->suberrors = NULL;
+
+ return subent;
+}
+
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription
+ * OID and relation OID. Return NULL if not found and the caller didn't
+ * request to create it.
+ *
+ * 'create' tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ PgStat_StatSubErrEntry *errent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ subent = pgstat_get_subscription_entry(subid, create);
+
+ if (subent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ errent = (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ errent->databaseid = InvalidOid;
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = 0;
+ }
+
+ return errent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 58923094f0..38f88cb70b 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3557,8 +3557,23 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ geterrmessage());
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3676,7 +3691,26 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ geterrmessage());
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..4048c99a9e 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset a subscription error stats */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,104 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubErrEntry *errent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ errent = pgstat_fetch_subscription_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (errent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(errent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(errent->relid))
+ values[2] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[2] = true;
+
+ if (errent->command == 0)
+ nulls[3] = true;
+ else
+ {
+ /* should be apply worker */
+ Assert(!OidIsValid(errent->subrelid));
+
+ values[3] = CStringGetTextDatum(logicalrep_message_type(errent->command));
+ }
+
+ if (TransactionIdIsValid(errent->xid))
+ values[4] = TransactionIdGetDatum(errent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(errent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ values[6] = Int64GetDatum(errent->failure_count);
+
+ if (errent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(errent->last_failure);
+
+ values[8] = CStringGetTextDatum(errent->last_errmsg);
+
+ if (errent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(errent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..dd36850016 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,6 +1441,22 @@ getinternalerrposition(void)
return edata->internalpos;
}
+/*
+ * geterrmessage --- return the currently set error message
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+const char *
+geterrmessage(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return (const char *) edata->message;
+}
/*
* Functions to allow construction of error message strings separately from
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b603700ed9..7f9c27bdda 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 509849c7ff..a6914a24e5 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -530,6 +534,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
+ * update/reset the error happening during logical
+ * replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, and a valid OID if reported by the table sync worker.
+ * In the table sync worker case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* The reset message uses below field */
+ bool m_reset; /* Reset all fields and set reset_stats
+ * timestamp */
+
+ /* The error report message uses below fields */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_failure_time;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by autovacuum to purge the subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by autovacuum to purge the subscription
+ * errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -701,6 +767,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -916,6 +985,38 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription statistics kept in the stats collector.
+ */
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubEntry;
+
+/*
+ * Subscription error statistics kept in the stats collector. One entry represents
+ * an error that happened during logical replication, reported by the apply worker
+ * (subrelid is InvalidOid) or by the table sync worker (subrelid is a valid OID).
+ * The error reported by the table sync worker is also removed when the
+ * table synchronization completes.
+ */
+
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char last_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1009,6 +1110,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1024,6 +1126,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1122,6 +1227,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid, Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index f53607e12e..155145a77d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -222,6 +222,7 @@ extern int err_generic_string(int field, const char *str);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
+extern const char *geterrmessage(void);
/*----------
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..ffad9790ae 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 621d0cb4da..0859a791fa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1938,6 +1938,9 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionErr
+PgStat_MsgSubscriptionErrPurge
+PgStat_MsgSubscriptionPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1949,6 +1952,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubEntry
+PgStat_StatSubErrEntry
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Tue, Aug 17, 2021 at 5:21 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Thursday, August 12, 2021 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset
cleanup to make cfbot tests happy.

Hi,
Thanks for your patch. I met a problem when using it. The log is not what I expected in some cases, but in streaming mode, they work well.
For example:
------publisher------
create table test (a int primary key, b varchar);
create publication pub for table test;

------subscriber------
create table test (a int primary key, b varchar);
insert into test values (10000);
create subscription sub connection 'dbname=postgres port=5432' publication pub with(streaming=on);

------publisher------
insert into test values (10000);

Subscriber log:
2021-08-17 14:24:43.415 CST [3630341] ERROR: duplicate key value violates unique constraint "test_pkey"
2021-08-17 14:24:43.415 CST [3630341] DETAIL: Key (a)=(10000) already exists.

It didn't give more context info generated by the apply_error_callback function.
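For reference, once such a failure has been recorded by the stats collector, the view and reset function proposed in this patch could be used along these lines (a hypothetical usage sketch based on the definitions in the patch; the view and function names may change in later versions):

```sql
-- Identify the failed transaction's remote XID for subscription 'sub'
SELECT subname, relid, command, xid, failure_count, last_failure_message
FROM pg_stat_subscription_errors
WHERE subname = 'sub';

-- After resolving (or deciding to skip) the problem transaction,
-- reset the apply worker's error stats for that subscription
-- (relid = NULL means the apply worker rather than a table sync worker)
SELECT pg_stat_reset_subscription_error(
         (SELECT oid FROM pg_subscription WHERE subname = 'sub'), NULL);
```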
Thank you for reporting the issue! This issue must be fixed in the
latest (v9) patches I've just submitted[1].
Regards,
[1]: /messages/by-id/CAD21AoCH4Jwn_NkJhvS6W5bZJKSaAYnC9inXqMJc6dLLvhvTQg@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Aug 18, 2021 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Aug 18, 2021 at 3:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 18, 2021 at 10:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Aug 18, 2021 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 18, 2021 at 6:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 17, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
It's right that we use "STREAM STOP" rather than "STREAM END" in many
places such as elog messages, a callback name, and source code
comments. As far as I have found there are two places where we’re
using "STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
doc/src/sgml/protocol.sgml. Isn't it better to fix these
inconsistencies in the first place? I think “STREAM STOP” would be
more appropriate.

I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
seems to be a bit better because of the value 'E' we use for it.

But I think we don't care about the actual value of
LOGICAL_REP_MSG_STREAM_END since we use the enum value rather than
'E'?

True, but here we are trying to be consistent with other enum values
where we try to use the first letter of the last word (which is E in
this case). I can see there are other cases where we are not
consistent so it won't be a big deal if we won't be consistent here. I
am neutral on this one, so, if you feel using STREAM_STOP would be
better from a code readability perspective then that is fine.

In addition to code readability, there is a description in the doc
that mentions "Stream End" but we describe "Stream Stop" in the later
description, which seems like a bug in the doc to me.

Doc changes look good to me. But I have a question about the code change:
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -65,7 +65,7 @@ typedef enum LogicalRepMsgType
 	LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
 	LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
 	LOGICAL_REP_MSG_STREAM_START = 'S',
-	LOGICAL_REP_MSG_STREAM_END = 'E',
+	LOGICAL_REP_MSG_STREAM_STOP = 'E',
 	LOGICAL_REP_MSG_STREAM_COMMIT = 'c',

As this is changing the enum name and if any extension (logical
replication extension) has started using it then they would require a
change. As this is the latest change in PG-14, it might be okay, but
OTOH, as this is just a code readability change, shall we do it only
for PG-15?

I think that the doc changes could be backpatched to PG14 but I think
we should do the code change only for PG15.
Okay, done that way!
--
With Regards,
Amit Kapila.
On Thu, Aug 19, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patches that incorporated all
comments I got so far unless I'm missing something. Please review
them.
The comments I made on Aug 16 and Aug 17 for the v8-0001 patch don't
seem to be addressed in the v9-0001 patch (if you disagree with them
that's fine, but best to say so and why).
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Aug 16, 2021 at 8:33 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Aug 16, 2021 at 6:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Therefore, perhaps a message like "... in transaction 740 with commit
timestamp 2021-08-10 14:44:38.058174+05:30" is better in terms of
consistency with other messages?

Yes, I think that would be more consistent.
On another note, for the 0001 patch, the elog ERROR at the bottom of
the logicalrep_message_type() function seems to assume that the
unrecognized "action" is a printable character (with its use of %c)
and also that the character is meaningful to the user in some way.
But given that the compiler normally warns of an unhandled enum value
when switching on an enum, such an error would most likely be when
action is some int value that wouldn't be meaningful to the user (as
it wouldn't be one of the LogicalRepMsgType enum values).
I therefore think it would be better to use %d in that ERROR, i.e.
+ elog(ERROR, "invalid logical replication message type %d", action);
Similar comments apply to the apply_dispatch() function (and I realise
it used %c before your patch).
The action in apply_dispatch is always a single byte so not sure why
we need %d here. Also, if it is used as %c before the patch then I
think it is better not to change it in this patch.
--
With Regards,
Amit Kapila.
On Thu, Aug 19, 2021 at 2:18 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Aug 19, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patches that incorporated all
comments I got so far unless I'm missing something. Please review
them.

The comments I made on Aug 16 and Aug 17 for the v8-0001 patch don't
seem to be addressed in the v9-0001 patch (if you disagree with them
that's fine, but best to say so and why).
Oops, sorry about that. I had just missed those comments. Let's
discuss them and I'll incorporate those comments in the v10 patch if
we agree with the changes.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Aug 19, 2021 at 3:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Aug 16, 2021 at 8:33 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Aug 16, 2021 at 6:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Therefore, perhaps a message like "... in transaction 740 with commit
timestamp 2021-08-10 14:44:38.058174+05:30" is better in terms of
consistency with other messages?

Yes, I think that would be more consistent.
On another note, for the 0001 patch, the elog ERROR at the bottom of
the logicalrep_message_type() function seems to assume that the
unrecognized "action" is a printable character (with its use of %c)
and also that the character is meaningful to the user in some way.
But given that the compiler normally warns of an unhandled enum value
when switching on an enum, such an error would most likely be when
action is some int value that wouldn't be meaningful to the user (as
it wouldn't be one of the LogicalRepMsgType enum values).
I therefore think it would be better to use %d in that ERROR, i.e.
+ elog(ERROR, "invalid logical replication message type %d", action);
Similar comments apply to the apply_dispatch() function (and I realise
it used %c before your patch).

The action in apply_dispatch is always a single byte so not sure why
we need %d here. Also, if it is used as %c before the patch then I
think it is better not to change it in this patch.
Yes, I agree that it's better not to change it in this patch since %c
is used before the patch. Also I can see some error messages in
walsender.c also use %c. If we conclude that it should use %d instead
of %c, we can change all of them as another patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Aug 17, 2021 at 12:00 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Aug 12, 2021 at 3:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset
cleanup to make cfbot tests happy.
Thank you for the comment!
Another comment on the 0001 patch: as there is now a mix of setting
"apply_error_callback_arg" members directly and also through inline
functions, it might look better to have it done consistently with
functions having prototypes something like the following:

static inline void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static inline void reset_apply_error_context_rel(void);
static inline void set_apply_error_context_attnum(int remote_attnum);

It might look consistent, but if we do that, we will end up needing a
function for every field we want to update whenever we add new fields
to the struct in the future?
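For reference, here is a minimal standalone sketch of the per-field setter style under discussion (the struct and names are illustrative stand-ins, not the patch's actual code); the maintenance concern is that each new struct field would need another pair of these:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the patch's apply_error_callback_arg struct. */
typedef struct DemoErrorCallbackArg
{
    const char *relname;        /* NULL if no relation is being applied */
    int         remote_attnum;  /* -1 if no column is being processed */
} DemoErrorCallbackArg;

static DemoErrorCallbackArg demo_error_arg = {NULL, -1};

/* One inline setter/reset per field, in the suggested style. */
static inline void
set_error_context_rel(const char *relname)
{
    demo_error_arg.relname = relname;
}

static inline void
reset_error_context_rel(void)
{
    demo_error_arg.relname = NULL;
}

static inline void
set_error_context_attnum(int remote_attnum)
{
    demo_error_arg.remote_attnum = remote_attnum;
}
```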
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Aug 19, 2021 at 4:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
The action in apply_dispatch is always a single byte so not sure why
we need %d here. Also, if it is used as %c before the patch then I
think it is better not to change it in this patch.
As I explained before, the point is that all the known message types
are handled in the switch statement cases (and you will get a compiler
warning if you miss one of the enum values in the switch cases).
So anything NOT handled in the switch, will be some OTHER value (and
note that any "int" value can be assigned to an enum).
Who says its value will be a printable character (%c) in this case?
And even if it is printable, will it help?
I think in this case it would be better to know the exact value of the
byte ("%d" or "0x%x" etc.), not the character equivalent.
I'm OK if it's done as a separate patch.
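To illustrate the point, a minimal standalone sketch (not from any patch; the function and buffer names are made up): a stray non-printable byte such as 7 (ASCII BEL) is invisible when formatted with %c, but unambiguous with %d.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an unrecognized message byte two ways: %c embeds the raw byte
 * into the string, while %d renders its numeric value as text.
 */
static void
format_invalid_action(int action, char *with_c, char *with_d, size_t len)
{
    snprintf(with_c, len, "invalid message type \"%c\"", action);
    snprintf(with_d, len, "invalid message type %d", action);
}
```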
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Aug 19, 2021 at 2:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 18, 2021 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Aug 18, 2021 at 3:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 18, 2021 at 10:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Aug 18, 2021 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Aug 18, 2021 at 6:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 17, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
It's right that we use "STREAM STOP" rather than "STREAM END" in many
places such as elog messages, a callback name, and source code
comments. As far as I have found there are two places where we’re
using "STREAM STOP": LOGICAL_REP_MSG_STREAM_END and a description in
doc/src/sgml/protocol.sgml. Isn't it better to fix these
inconsistencies in the first place? I think “STREAM STOP” would be
more appropriate.

I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
seems to be a bit better because of the value 'E' we use for it.

But I think we don't care about the actual value of
LOGICAL_REP_MSG_STREAM_END since we use the enum value rather than
'E'?

True, but here we are trying to be consistent with other enum values
where we try to use the first letter of the last word (which is E in
this case). I can see there are other cases where we are not
consistent so it won't be a big deal if we won't be consistent here. I
am neutral on this one, so, if you feel using STREAM_STOP would be
better from a code readability perspective then that is fine.

In addition to code readability, there is a description in the doc
that mentions "Stream End" but we describe "Stream Stop" in the later
description, which seems like a bug in the doc to me:

Doc changes look good to me. But I have a question about the code change:

--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -65,7 +65,7 @@ typedef enum LogicalRepMsgType
 	LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
 	LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
 	LOGICAL_REP_MSG_STREAM_START = 'S',
-	LOGICAL_REP_MSG_STREAM_END = 'E',
+	LOGICAL_REP_MSG_STREAM_STOP = 'E',
 	LOGICAL_REP_MSG_STREAM_COMMIT = 'c',

As this is changing the enum name, any extension (a logical
replication extension) that has started using it would require a
change. As this is the latest change in PG-14, it might be okay, but
OTOH, as this is just a code readability change, shall we do it only
for PG-15?

I think that the doc changes could be backpatched to PG14 but I think
we should do the code change only for PG15.

Okay, done that way!
Thanks!
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Aug 19, 2021 9:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patches that incorporated all comments I
got so far unless I'm missing something. Please review them.
Thanks for the new version patches.
The v9-0001 patch looks good to me and I will start to review other patches.
Best regards,
Hou zj
On Thu, Aug 19, 2021 at 12:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 17, 2021 at 12:00 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
Another comment on the 0001 patch: as there is now a mix of setting
"apply_error_callback_arg" members directly and also through inline
functions, it might look better to have it done consistently with
functions having prototypes something like the following:

static inline void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static inline void reset_apply_error_context_rel(void);
static inline void set_apply_error_context_attnum(int remote_attnum);

It might look consistent, but if we do that, we will end up needing a
function for every field we want to update whenever we add new fields
to the struct in the future?
Yeah, I also think it is too much, but we can add comments wherever
we set the information for the error callback. I see it is missing when
the patch is setting remote_attnum; see similar other places and add
comments if already not there.
--
With Regards,
Amit Kapila.
On Thu, Aug 19, 2021 at 9:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 19, 2021 at 12:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 17, 2021 at 12:00 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
Another comment on the 0001 patch: as there is now a mix of setting
"apply_error_callback_arg" members directly and also through inline
functions, it might look better to have it done consistently with
functions having prototypes something like the following:

static inline void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static inline void reset_apply_error_context_rel(void);
static inline void set_apply_error_context_attnum(int remote_attnum);

It might look consistent, but if we do that, we will end up needing a
function for every field we want to update whenever we add new fields
to the struct in the future?

Yeah, I also think it is too much, but we can add comments wherever
we set the information for the error callback. I see it is missing when
the patch is setting remote_attnum; see similar other places and add
comments if already not there.
Agreed. Will add comments in the next version patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thursday, August 19, 2021 9:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for reporting the issue! This issue should be fixed in the
latest (v9) patches I've just submitted[1].
Thanks for your patch.
I've confirmed the issue is fixed as you said.
Regards
Tang
On Fri, Aug 20, 2021 at 6:14 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Thursday, August 19, 2021 9:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for reporting the issue! This issue should be fixed in the
latest (v9) patches I've just submitted[1].

Thanks for your patch.
I've confirmed the issue is fixed as you said.
Thanks for your confirmation!
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Aug 19, 2021 at 10:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Aug 19, 2021 at 9:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 19, 2021 at 12:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Aug 17, 2021 at 12:00 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
Another comment on the 0001 patch: as there is now a mix of setting
"apply_error_callback_arg" members directly and also through inline
functions, it might look better to have it done consistently with
functions having prototypes something like the following:

static inline void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
static inline void reset_apply_error_context_rel(void);
static inline void set_apply_error_context_attnum(int remote_attnum);

It might look consistent, but if we do that, we will end up needing a
function for every field we want to update whenever we add new fields
to the struct in the future?

Yeah, I also think it is too much, but we can add comments wherever
we set the information for the error callback. I see it is missing when
the patch is setting remote_attnum; see similar other places and add
comments if already not there.

Agreed. Will add comments in the next version patch.
I've attached updated patches. Please review them.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v10-0005-Move-shared-fileset-cleanup-to-before_shmem_exit.patch
From 539cd2119058deb9c223f55d8e66313b0deed303 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 12 Aug 2021 10:57:41 +0900
Subject: [PATCH v10 5/5] Move shared fileset cleanup to before_shmem_exit().
The reported problem is that a shared fileset created in
SharedFileSetInit() by a logical replication apply worker is cleaned up
in SharedFileSetDeleteOnProcExit() when the process exits on an error
due to a conflict. As shared fileset cleanup causes pgstat reporting
for underlying temporary files, the assertions added in ee3f8d3d3ae
caused failures.
To fix the problem, similar to 675c945394, move shared fileset cleanup
to a before_shmem_exit() hook, ensuring that the fileset is dropped
while we can still report stats for underlying temporary files.
---
src/backend/storage/file/sharedfileset.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/src/backend/storage/file/sharedfileset.c b/src/backend/storage/file/sharedfileset.c
index ed37c940ad..0d9700bf56 100644
--- a/src/backend/storage/file/sharedfileset.c
+++ b/src/backend/storage/file/sharedfileset.c
@@ -36,7 +36,7 @@
static List *filesetlist = NIL;
static void SharedFileSetOnDetach(dsm_segment *segment, Datum datum);
-static void SharedFileSetDeleteOnProcExit(int status, Datum arg);
+static void SharedFileSetDeleteBeforeShmemExit(int status, Datum arg);
static void SharedFileSetPath(char *path, SharedFileSet *fileset, Oid tablespace);
static void SharedFilePath(char *path, SharedFileSet *fileset, const char *name);
static Oid ChooseTablespace(const SharedFileSet *fileset, const char *name);
@@ -112,7 +112,12 @@ SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
* fileset clean up.
*/
Assert(filesetlist == NIL);
- on_proc_exit(SharedFileSetDeleteOnProcExit, 0);
+
+ /*
+ * Register before-shmem-exit hook to ensure fileset is dropped
+ * while we can still report stats for underlying temporary files.
+ */
+ before_shmem_exit(SharedFileSetDeleteBeforeShmemExit, 0);
registered_cleanup = true;
}
@@ -259,12 +264,12 @@ SharedFileSetOnDetach(dsm_segment *segment, Datum datum)
}
/*
- * Callback function that will be invoked on the process exit. This will
+ * Callback function that will be invoked before shmem exit. This will
* process the list of all the registered sharedfilesets and delete the
* underlying files.
*/
static void
-SharedFileSetDeleteOnProcExit(int status, Datum arg)
+SharedFileSetDeleteBeforeShmemExit(int status, Datum arg)
{
/*
* Remove all the pending shared fileset entries. We don't use foreach()
--
2.24.3 (Apple Git-128)
v10-0001-Add-logical-changes-details-to-errcontext-of-app.patch
From 2e836931179c610b9acf4c8f9557ec1da8b4bfbb Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v10 1/5] Add logical changes details to errcontext of apply
worker errors.
Previously, the error context was set only for data conversion
failures. This commit expands the error context to add the details of
the logical change being applied by the apply worker, newly showing
the command, transaction, and commit timestamp.

This additional information can be used by a follow-up commit that
enables skipping a particular transaction on the subscriber.
---
src/backend/replication/logical/proto.c | 53 +++++
src/backend/replication/logical/worker.c | 255 ++++++++++++++++-------
src/include/replication/logicalproto.h | 1 +
src/tools/pgindent/typedefs.list | 2 +-
4 files changed, 231 insertions(+), 80 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9732982d93..cdbc6838cc 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1156,3 +1156,56 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Get string representing LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_STOP:
+ return "STREAM STOP";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_PREPARE:
+ return "STREAM PREPARE";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+
+ return NULL; /* keep compiler quiet */
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 38b493e4f5..3c707e3a1e 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -203,12 +203,6 @@ typedef struct FlushPosition
static dlist_head lsn_mapping = DLIST_STATIC_INIT(lsn_mapping);
-typedef struct SlotErrCallbackArg
-{
- LogicalRepRelMapEntry *rel;
- int remote_attnum;
-} SlotErrCallbackArg;
-
typedef struct ApplyExecutionData
{
EState *estate; /* executor state, used to track resources */
@@ -221,6 +215,27 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrorCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+ LogicalRepRelMapEntry *rel;
+
+ /* Remote information */
+ int remote_attnum; /* -1 if invalid */
+ TransactionId remote_xid;
+ TimestampTz commit_ts;
+} ApplyErrorCallbackArg;
+
+static ApplyErrorCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .rel = NULL,
+ .remote_attnum = -1,
+ .remote_xid = InvalidTransactionId,
+ .commit_ts = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -335,6 +350,11 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for apply error callback */
+static void apply_error_callback(void *arg);
+static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts);
+static inline void reset_apply_error_context_info(void);
+
/*
* Should this worker apply changes for given relation.
*
@@ -580,26 +600,6 @@ slot_fill_defaults(LogicalRepRelMapEntry *rel, EState *estate,
ExecEvalExpr(defexprs[i], econtext, &slot->tts_isnull[defmap[i]]);
}
-/*
- * Error callback to give more context info about data conversion failures
- * while reading data from the remote server.
- */
-static void
-slot_store_error_callback(void *arg)
-{
- SlotErrCallbackArg *errarg = (SlotErrCallbackArg *) arg;
- LogicalRepRelMapEntry *rel;
-
- /* Nothing to do if remote attribute number is not set */
- if (errarg->remote_attnum < 0)
- return;
-
- rel = errarg->rel;
- errcontext("processing remote data for replication target relation \"%s.%s\" column \"%s\"",
- rel->remoterel.nspname, rel->remoterel.relname,
- rel->remoterel.attnames[errarg->remote_attnum]);
-}
-
/*
* Store tuple data into slot.
*
@@ -611,19 +611,9 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
{
int natts = slot->tts_tupleDescriptor->natts;
int i;
- SlotErrCallbackArg errarg;
- ErrorContextCallback errcallback;
ExecClearTuple(slot);
- /* Push callback + info on the error context stack */
- errarg.rel = rel;
- errarg.remote_attnum = -1;
- errcallback.callback = slot_store_error_callback;
- errcallback.arg = (void *) &errarg;
- errcallback.previous = error_context_stack;
- error_context_stack = &errcallback;
-
/* Call the "in" function for each non-dropped, non-null attribute */
Assert(natts == rel->attrmap->maplen);
for (i = 0; i < natts; i++)
@@ -637,7 +627,8 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
Assert(remoteattnum < tupleData->ncols);
- errarg.remote_attnum = remoteattnum;
+ /* Set attnum for error callback */
+ apply_error_callback_arg.remote_attnum = remoteattnum;
if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
{
@@ -685,7 +676,8 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
slot->tts_isnull[i] = true;
}
- errarg.remote_attnum = -1;
+ /* Reset attnum for error callback */
+ apply_error_callback_arg.remote_attnum = -1;
}
else
{
@@ -699,9 +691,6 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
}
}
- /* Pop the error context stack */
- error_context_stack = errcallback.previous;
-
ExecStoreVirtualTuple(slot);
}
@@ -724,8 +713,6 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
{
int natts = slot->tts_tupleDescriptor->natts;
int i;
- SlotErrCallbackArg errarg;
- ErrorContextCallback errcallback;
/* We'll fill "slot" with a virtual tuple, so we must start with ... */
ExecClearTuple(slot);
@@ -739,14 +726,6 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
memcpy(slot->tts_values, srcslot->tts_values, natts * sizeof(Datum));
memcpy(slot->tts_isnull, srcslot->tts_isnull, natts * sizeof(bool));
- /* For error reporting, push callback + info on the error context stack */
- errarg.rel = rel;
- errarg.remote_attnum = -1;
- errcallback.callback = slot_store_error_callback;
- errcallback.arg = (void *) &errarg;
- errcallback.previous = error_context_stack;
- error_context_stack = &errcallback;
-
/* Call the "in" function for each replaced attribute */
Assert(natts == rel->attrmap->maplen);
for (i = 0; i < natts; i++)
@@ -763,7 +742,8 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
{
StringInfo colvalue = &tupleData->colvalues[remoteattnum];
- errarg.remote_attnum = remoteattnum;
+ /* Set attnum for error callback */
+ apply_error_callback_arg.remote_attnum = remoteattnum;
if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
{
@@ -807,13 +787,11 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
slot->tts_isnull[i] = true;
}
- errarg.remote_attnum = -1;
+ /* Reset attnum for error callback */
+ apply_error_callback_arg.remote_attnum = -1;
}
}
- /* Pop the error context stack */
- error_context_stack = errcallback.previous;
-
/* And finally, declare that "slot" contains a valid virtual tuple */
ExecStoreVirtualTuple(slot);
}
@@ -827,6 +805,7 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.committime);
remote_final_lsn = begin_data.final_lsn;
@@ -860,6 +839,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -877,6 +857,7 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.prepare_time);
remote_final_lsn = begin_data.prepare_lsn;
@@ -962,6 +943,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -974,6 +956,7 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -1001,6 +984,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1013,6 +997,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1050,6 +1035,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1076,6 +1062,7 @@ apply_handle_stream_prepare(StringInfo s)
errmsg_internal("tablesync worker received a STREAM PREPARE message")));
logicalrep_read_stream_prepare(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.prepare_time);
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
@@ -1100,6 +1087,8 @@ apply_handle_stream_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1156,6 +1145,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ set_apply_error_context_xact(stream_xid, 0);
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1212,6 +1203,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1235,7 +1227,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ set_apply_error_context_xact(xid, 0);
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1260,6 +1255,8 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ set_apply_error_context_xact(subxid, 0);
+
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1284,6 +1281,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1315,6 +1313,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1459,6 +1459,7 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ set_apply_error_context_xact(xid, commit_data.committime);
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1473,6 +1474,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1592,6 +1595,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1615,6 +1621,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1713,6 +1722,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1766,6 +1778,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1869,6 +1884,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1894,6 +1912,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2328,44 +2349,53 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ LogicalRepMsgType saved_command;
+
+ /*
+ * Set the current command being applied. Since this function can be
+ * called recursively when applying spooled changes, save the current
+ * command.
+ */
+ saved_command = apply_error_callback_arg.command;
+ apply_error_callback_arg.command = action;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2374,49 +2404,52 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_STOP:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_PREPARE:
apply_handle_stream_prepare(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Restore the saved command */
+ apply_error_callback_arg.command = saved_command;
}
/*
@@ -2517,6 +2550,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
TimeLineID tli;
+ ErrorContextCallback errcallback;
/*
* Init the ApplyMessageContext which we clean up after each replication
@@ -2537,6 +2571,14 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
/* mark as idle, before starting to loop */
pgstat_report_activity(STATE_IDLE, NULL);
+ /*
+ * Push the apply error context callback. Fields will be filled
+ * while applying a change.
+ */
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
+
/* This outer loop iterates once per wait. */
for (;;)
{
@@ -2737,6 +2779,9 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
+
/* All done */
walrcv_endstreaming(LogRepWorkerWalRcvConn, &tli);
}
@@ -3649,3 +3694,55 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+/* Error callback to give more context info about the change being applied */
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+ ApplyErrorCallbackArg *errarg = &apply_error_callback_arg;
+
+ if (errarg->command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("processing remote data during \"%s\""),
+ logicalrep_message_type(errarg->command));
+
+ if (errarg->rel)
+ appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname);
+
+ if (errarg->remote_attnum >= 0)
+ appendStringInfo(&buf, _(" column \"%s\""),
+ errarg->rel->remoterel.attnames[errarg->remote_attnum]);
+
+ if (TransactionIdIsNormal(errarg->remote_xid))
+ appendStringInfo(&buf, _(" in transaction %u with commit timestamp %s"),
+ errarg->remote_xid,
+ errarg->commit_ts == 0
+ ? "(unset)"
+ : timestamptz_to_str(errarg->commit_ts));
+
+ errcontext("%s", buf.data);
+ pfree(buf.data);
+}
+
+/* Set transaction information of apply error callback */
+static inline void
+set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts)
+{
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.commit_ts = commit_ts;
+}
+
+/* Reset all information of apply error callback */
+static inline void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.rel = NULL;
+ apply_error_callback_arg.remote_attnum = -1;
+ set_apply_error_context_xact(InvalidTransactionId, 0);
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 95c1561ca0..83741dcf42 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -246,5 +246,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 37cf4b2f76..621d0cb4da 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -113,6 +113,7 @@ Append
AppendPath
AppendRelInfo
AppendState
+ApplyErrorCallbackArg
ApplyExecutionData
ApplySubXactData
Archive
@@ -2423,7 +2424,6 @@ SlabBlock
SlabChunk
SlabContext
SlabSlot
-SlotErrCallbackArg
SlotNumber
SlruCtl
SlruCtlData
--
2.24.3 (Apple Git-128)
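For reviewers who want to see the new error context in action, here is a minimal way to provoke an apply conflict on the subscriber; the table and key values below are illustrative (matching the example in this mail), not taken from the patch itself:

```sql
-- On the subscriber: seed a row that will collide with a row
-- replicated from the publisher (names and values are examples).
CREATE TABLE test (c int PRIMARY KEY);
INSERT INTO test VALUES (1);

-- After the publisher inserts the same key, the apply worker fails
-- and the subscriber log should show the new errcontext line, e.g.:
--   ERROR:  duplicate key value violates unique constraint "test_pkey"
--   CONTEXT: processing remote data during "INSERT" for replication
--     target relation "public.test" in transaction 590 with commit
--     timestamp 2021-05-21 14:32:02.134273+09
```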
Attachment: v10-0003-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch (application/octet-stream)
From 6f6cd58c5600bbf84d434f633df2fe585ea0a286 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v10 3/5] Add RESET command to ALTER SUBSCRIPTION command.
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters to their defaults. The parameters that can be reset are
streaming, binary, and synchronous_commit.
The RESET command is required by a follow-up commit, which introduces
a new parameter, skip_xid, that needs to be reset.
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 78 +++++++++++++++++-----
src/backend/parser/gram.y | 11 ++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 +++-
src/test/regress/sql/subscription.sql | 13 ++++
6 files changed, 109 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..8c3c28b7e7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,16 +193,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5157f44058..cc390ce95a 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -923,10 +938,43 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +1009,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1011,7 +1059,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1059,7 +1107,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..bcf85e8980 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,7 +9707,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 7af13dee43..3f55d63425 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,7 +3659,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3671,7 +3672,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 77b4437b69..b87f67fe55 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -284,11 +284,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index d42104c191..aa90560691 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -218,6 +218,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
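To illustrate the behavior added by the 0003 patch, a short usage sketch (the subscription name is the one used in the regression tests above, which exercise the same code paths):

```sql
-- Reset parameters back to their defaults in one command.
ALTER SUBSCRIPTION regress_testsub RESET (streaming, binary);

-- RESET rejects values, per the new check in
-- parse_subscription_options():
--   ALTER SUBSCRIPTION regress_testsub RESET (streaming = on);
--   ERROR:  RESET must not include values for parameters
```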
Attachment: v10-0004-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch (application/octet-stream)
From 7d70db1da75a5b266c029c54feea1dda9d4ab98d Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v10 4/5] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates any constraint, logical replication
stops until the problem is resolved. This commit introduces another
way to skip the transaction in question.
The user can specify XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating pg_subscription.subskipxid field, telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of
the subscription in the pg_stat_subscription_errors system view so
that the user does not get confused. This is done by sending a
message to the stats collector that clears the subscription error.
---
doc/src/sgml/logical-replication.sgml | 49 ++++-
doc/src/sgml/ref/alter_subscription.sgml | 32 ++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 45 +++-
src/backend/postmaster/pgstat.c | 44 +++-
src/backend/replication/logical/worker.c | 201 ++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/include/pgstat.h | 7 +-
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 +
src/test/subscription/t/024_skip_xact.pl | 244 +++++++++++++++++++++
12 files changed, 636 insertions(+), 25 deletions(-)
create mode 100644 src/test/subscription/t/024_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..d558dcfe81 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in the <structname>pg_stat_subscription_errors</structname>
+ view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ The error is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found in those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> on the subscription
+ with <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Either way, these methods should be used as a last resort. They skip the whole
+ transaction, including changes that may not violate any constraint, and can
+ easily make the subscriber inconsistent if the user specifies the wrong
+ transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 8c3c28b7e7..cfb318e08c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -205,8 +205,36 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are: <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>, and the
+ following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ will stop until the problem is resolved. The resolution can be done
+ either by changing data on the subscriber so that it doesn't conflict
+ with the incoming change or by skipping the whole transaction. This
+ option specifies the transaction ID whose changes the logical
+ replication worker will skip applying. The worker skips all data
+ modification changes within the specified transaction. Since this
+ skips the whole transaction, including changes that may not violate
+ any constraint, it should only be used as a last resort. This option
+ has no effect on a transaction that is already prepared with
+ <literal>two_phase</literal> enabled on the subscriber. After the
+ logical replication worker successfully skips the transaction, the
+ transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index cc390ce95a..188f3e42fd 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction ID")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -885,7 +913,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, false);
@@ -934,6 +962,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
@@ -941,7 +976,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, true);
@@ -967,6 +1002,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 7e3938d0d3..911a031b60 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1743,11 +1743,32 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
msg.m_subid = subid;
msg.m_subrelid = subrelid;
+ msg.m_clear = false;
msg.m_reset = true;
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
}
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error of the given subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_clear = true;
+ msg.m_reset = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -2034,6 +2055,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_subid = subid;
msg.m_subrelid = subrelid;
msg.m_reset = false;
+ msg.m_clear = false;
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
msg.m_command = command;
@@ -6134,27 +6156,37 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
PgStat_StatSubErrEntry *errent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
errent = pgstat_get_subscription_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
+ Assert(!(msg->m_reset && msg->m_clear));
+
if (errent == NULL)
return;
- /* reset fields and set reset timestamp */
errent->relid = InvalidOid;
errent->command = 0;
errent->xid = InvalidTransactionId;
errent->failure_count = 0;
- errent->last_failure = 0;
- errent->last_errmsg[0] = '\0';
- errent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 3264b36c81..2832aa3219 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -269,6 +270,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID when we are skipping all data modification
+ * changes (INSERT/UPDATE/DELETE/TRUNCATE) of the transaction specified by
+ * MySubscription->skipxid. Note that we don't skip receiving the changes,
+ * particularly in streaming cases, since we decide whether or not to skip
+ * applying the changes only when we start to apply them. Once we start
+ * skipping changes, we copy the XID to skipping_xid and keep skipping until
+ * the whole transaction has been skipped, even if the subscription is
+ * invalidated and MySubscription->skipxid is changed or reset in the
+ * meantime. When we stop skipping, we reset the skip XID (subskipxid) in
+ * the pg_subscription catalog and associate the origin status with the
+ * transaction that resets the skip XID so that we can start streaming from
+ * the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -355,6 +371,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz commit_ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -809,6 +828,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -833,7 +857,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop skipping the transaction if enabled. Otherwise, commit the
+ * changes that were just applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -861,6 +896,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -919,9 +957,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -935,6 +974,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1066,6 +1109,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1076,6 +1122,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1101,9 +1151,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1125,6 +1176,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1137,9 +1191,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1221,6 +1272,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1314,6 +1366,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping the transaction if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1463,9 +1519,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2351,6 +2421,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3791,3 +3872,103 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by the skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID (pg_subscription.subskipxid).
+ * If origin_lsn and origin_committs are valid, we associate the origin state
+ * with the transaction commit that resets the skip XID so that we can start
+ * streaming from the transaction next to the one that we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know that
+ * the subscription is no longer stuck on the conflict.
+ *
+ * The message for clearing the error statistics can be lost but that's
+ * okay. The user can confirm that logical replication is working fine in
+ * other ways, for example, by checking the pg_stat_subscription view.
+ * And the user is able to reset the error statistics of a single
+ * subscription by the pg_stat_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3f55d63425..93bfef0e9c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3677,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index a6914a24e5..6775736b2b 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -536,7 +536,7 @@ typedef struct PgStat_MsgReplSlot
/* ----------
* PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
- * update/reset the error happening during logical
+ * update/reset/clear the error happening during logical
* replication.
* ----------
*/
@@ -554,7 +554,9 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The reset message uses below field */
+ /* The clear and reset messages use below fields */
+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */
bool m_reset; /* Reset all fields and set reset_stats
* timestamp */
@@ -1111,6 +1113,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index b87f67fe55..217b5fabd1 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -296,6 +296,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index aa90560691..4c9d25f0a4 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -231,6 +231,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/024_skip_xact.pl b/src/test/subscription/t/024_skip_xact.pl
new file mode 100644
index 0000000000..affb663803
--- /dev/null
+++ b/src/test/subscription/t/024_skip_xact.pl
@@ -0,0 +1,244 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# Test that the error reported in the pg_stat_subscription_errors view is expected.
+sub test_subscription_error
+{
+ my ($node, $source, $relname, $expected_error, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Check the error reported in the pg_stat_subscription_errors view and skip
+# the failed transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $source, $subname, $relname, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $source, $relname, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber to restart logical replication without waiting
+ # for wal_retrieve_retry_interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+ # Also wait for the error details to be cleared.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT command IS NULL FROM pg_stat_subscription_errors
+WHERE subname = '$subname' AND failure_source = '$source';
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# don't overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violating
+# the unique constraint on test_tab1. Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber,
+ 'tablesync', 'test_tab2',
+ qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'check the error reported by the table sync worker');
+
+# Insert enough rows to test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber while applying spooled changes for the same reason. Then
+# skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error reported by the apply worker while applying streaming changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping transactions that are prepared or stream-prepared. We insert
+# the same data as in the previous tests but prepare the transactions. Those
+# insertions raise an error on the subscriber. Then we skip the transactions in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error on changes of the prepared transaction');
+
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
+# Check that the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
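
For reviewers trying the patch set out, the intended workflow looks roughly like the following psql session. This is only a sketch: the XID (590) and subscription name (tap_sub) are placeholders, and the view, catalog column, and option names are the ones introduced by this patch series, so they apply only with these patches installed.

```sql
-- The apply worker keeps failing; find the remote XID of the failing
-- transaction from the error statistics view added by patch 0002.
SELECT subname, xid, last_failure_message
FROM pg_stat_subscription_errors;

-- Tell the apply worker to skip all data-modification changes
-- (INSERT/UPDATE/DELETE/TRUNCATE) of that remote transaction.
ALTER SUBSCRIPTION tap_sub SET (skip_xid = 590);

-- Once the transaction has been skipped, the worker resets the catalog
-- field itself; verify that it is NULL again.
SELECT subskipxid FROM pg_subscription WHERE subname = 'tap_sub';

-- If the wrong XID was set, it can be cleared before it ever matches.
ALTER SUBSCRIPTION tap_sub RESET (skip_xid);
```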
From 7ee81c432ed8cc9b713ee0f645ac58d60e426190 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v10 2/5] Add pg_stat_subscription_errors statistics view.
This commit adds a new system view, pg_stat_subscription_errors,
showing errors that happen while applying logical replication changes
as well as during the initial table synchronization.
The subscription error entries are removed by autovacuum workers: when
the table synchronization has completed in the table sync worker case,
and when the subscription is dropped in the apply worker case.
It also adds the SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
---
doc/src/sgml/monitoring.sgml | 160 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/postmaster/pgstat.c | 651 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 51 +-
src/backend/utils/adt/pgstatfuncs.c | 113 ++++
src/backend/utils/error/elog.c | 1 -
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 106 ++++
src/test/regress/expected/rules.out | 22 +
src/tools/pgindent/typedefs.list | 5 +
11 files changed, 1147 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..0c02e46947 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that happened on a subscription, showing
+ information about the subscription error.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is processing when the
+ error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error happened. This
+ field is always NULL if the error is reported by the
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node's transaction being applied
+ when the error happened. This field is always NULL if the error is
+ reported by the <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of worker that reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times the error happened on the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the time of the last failure.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..449692afa9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
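
With this view in place, identifying the problem transaction described at the top of this mail becomes a simple query; a sketch, assuming a subscription named test_sub:

```sql
-- Find the failing command, relation, and remote xid for a subscription,
-- e.g. to decide whether to skip that transaction on the subscriber.
SELECT subname, relid::regclass, command, xid,
       failure_count, last_failure, last_failure_message
FROM pg_stat_subscription_errors
WHERE subname = 'test_sub';
```
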
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index a3c35bdf60..7e3938d0d3 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -40,6 +40,8 @@
#include "access/xact.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +107,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_HASH_SIZE 32
/* ----------
@@ -279,6 +282,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -329,6 +333,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubEntry *pgstat_get_subscription_entry(Oid subid,
+ bool create);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -368,6 +378,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1146,6 +1160,166 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+	 * Search for all the dead subscriptions and error entries in the stats
+	 * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubEntry *subent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS hstat_rel;
+ List *not_ready_rels_list;
+ HTAB *not_ready_rels_htab;
+ ListCell *lc;
+ HASHCTL hash_ctl;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = subent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (subent->suberrors == NULL)
+ continue;
+
+ /*
+		 * The subscription has error entries. We look for errors of table
+		 * sync workers whose relations have already reached the ready
+		 * state; those errors should be removed.
+		 *
+		 * Note that the lifetimes of the error entries of the apply worker
+		 * and the table sync worker are different: the former lives until
+		 * the subscription is dropped, whereas the latter lives until the
+		 * table synchronization is completed.
+ */
+ not_ready_rels_list = GetSubscriptionNotReadyRelations(subent->subid);
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ /*
+ * The number of not-ready relations can be high for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for faster lookups.
+ */
+ foreach(lc, not_ready_rels_list)
+ {
+ SubscriptionRelState *r_elem = (SubscriptionRelState *) lfirst(lc);
+ SubscriptionRelState *r_entry;
+
+ CHECK_FOR_INTERRUPTS();
+ r_entry = hash_search(not_ready_rels_htab, (void *) &(r_elem->relid),
+ HASH_ENTER, NULL);
+ memcpy(r_entry, r_elem, sizeof(SubscriptionRelState));
+ }
+
+ list_free(not_ready_rels_list);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+
+ /*
+		 * Search for all error entries whose relation is already in the
+		 * ready state
+ */
+ hash_seq_init(&hstat_rel, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(errent->subrelid))
+ continue;
+
+ /*
+			 * Add the relid to the message if the table synchronization
+			 * for this relation has already completed or the table is no
+			 * longer subscribed.
+ */
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->subrelid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(not_ready_rels_htab);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1555,6 +1729,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ *	Tell the collector to reset the subscription error stats.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1820,6 +2013,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about the subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = false;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_failure_time = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2886,6 +3110,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3463,6 +3703,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3763,6 +4016,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionHash)
+ {
+ PgStat_StatSubEntry *subent;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS relhstat;
+			int32		nerrors;
+
+			/* Skip this subscription if it does not have any errors */
+			if (subent->suberrors == NULL)
+				continue;
+
+			nerrors = hash_get_num_entries(subent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(subent, sizeof(PgStat_StatSubEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+				 * XXX we write the whole PgStat_StatSubErrEntry entry, which
+				 * contains the fixed-length error message string of
+				 * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes, bloating the stats
+				 * file. That is okay since we assume that the number of
+				 * error entries is not high. If that expectation turns out
+				 * to be false, we should write the string and its length
+				 * instead.
+ */
+ rc = fwrite(errent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4224,6 +4521,100 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubEntry *subent;
+ int32 nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin) !=
+ sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+					/* Enter the subscription entry */
+ subent =
+ (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &(subbuf.subid),
+ HASH_ENTER, NULL);
+ subent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Read error entries */
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubErrEntry errbuf;
+ PgStat_StatSubErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ errent =
+ (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &(errbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(errent, &errbuf, sizeof(PgStat_StatSubErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4566,6 +4957,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+						FreeFile(fpin);
+						return false;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+							FreeFile(fpin);
+							return false;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4771,6 +5206,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5689,6 +6125,116 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ if (errent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ Assert(errent);
+
+ /* update the error entry */
+ errent->databaseid = msg->m_databaseid;
+ errent->relid = msg->m_relid;
+ errent->command = msg->m_command;
+ errent->xid = msg->m_xid;
+ errent->failure_count++;
+ errent->last_failure = msg->m_failure_time;
+ strlcpy(errent->last_errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (subent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (subent->suberrors != NULL)
+ hash_destroy(subent->suberrors);
+
+ /* Remove the subscription entry */
+ (void) hash_search(subscriptionHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subid, false);
+
+	/*
+	 * Nothing to do if the subscription entry is not found or has no error
+	 * entries. This could happen when the subscription with msg->m_subid is
+	 * removed and the corresponding entry is also removed before receiving
+	 * the error purge message.
+	 */
+	if (subent == NULL || subent->suberrors == NULL)
+		return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ (void) hash_search(subent->suberrors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5786,6 +6332,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_entry
+ *
+ * Return the subscription statistics entry for the given subscription OID.
+ * Return NULL if not found and the caller didn't request to create it.
+ *
+ * 'create' tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubEntry *
+pgstat_get_subscription_entry(Oid subid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ subent = (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize field */
+ if (create && !found)
+ subent->suberrors = NULL;
+
+ return subent;
+}
+
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription OID
+ * and relation OID. Return NULL if not found and the caller didn't
+ * request to create it.
+ *
+ * 'create' tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ PgStat_StatSubErrEntry *errent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ subent = pgstat_get_subscription_entry(subid, create);
+
+ if (subent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ errent = (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ errent->databaseid = InvalidOid;
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = 0;
+ }
+
+ return errent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 3c707e3a1e..3264b36c81 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3462,6 +3462,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3562,8 +3563,27 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3681,7 +3701,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..c454e2f8bc 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset subscription error stats */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,98 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubErrEntry *errent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ errent = pgstat_fetch_subscription_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (errent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(errent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(errent->relid))
+ values[2] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[2] = true;
+
+ if (errent->command == 0)
+ nulls[3] = true;
+ else
+ values[3] = CStringGetTextDatum(logicalrep_message_type(errent->command));
+
+ if (TransactionIdIsValid(errent->xid))
+ values[4] = TransactionIdGetDatum(errent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(errent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ values[6] = Int64GetDatum(errent->failure_count);
+
+ if (errent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(errent->last_failure);
+
+ values[8] = CStringGetTextDatum(errent->last_errmsg);
+
+ if (errent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(errent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
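
The SQL-callable function above can also be used directly; a sketch (the subscription name test_sub is illustrative), passing NULL for relid to fetch the apply worker's entry. Since the function returns NULL when no error has been recorded, the row comes back all-NULL in that case:

```sql
-- Fetch the apply worker's error entry for one subscription directly,
-- without going through the pg_stat_subscription_errors view.
SELECT command, xid, failure_count, last_failure, last_failure_message
FROM pg_stat_get_subscription_error(
       (SELECT oid FROM pg_subscription WHERE subname = 'test_sub'),
       NULL);
```
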
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..871f7b1b15 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,7 +1441,6 @@ getinternalerrposition(void)
return edata->internalpos;
}
-
/*
* Functions to allow construction of error message strings separately from
* the ereport() call itself.
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b603700ed9..7f9c27bdda 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 509849c7ff..a6914a24e5 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -530,6 +534,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr	Sent by the apply worker or the table sync
+ *								worker to update/reset an error that happened
+ *								during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+	/* The reset message uses the field below */
+ bool m_reset; /* Reset all fields and set reset_stats
+ * timestamp */
+
+	/* The error report message uses the fields below */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_failure_time;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge	Sent by autovacuum to purge dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge	Sent by autovacuum to purge the
+ *									subscription errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -701,6 +767,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -916,6 +985,38 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription statistics kept in the stats collector.
+ */
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubEntry;
+
+/*
+ * Subscription error statistics kept in the stats collector. One entry represents
+ * an error that happened during logical replication, reported by the apply worker
+ * (subrelid is InvalidOid) or by the table sync worker (subrelid is a valid OID).
+ * An error reported by the table sync worker is also removed when the table
+ * synchronization process completes.
+ */
+
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subrelid; /* Hash table key. InvalidOid if reported
+ * by the apply worker, otherwise by the
+ * table sync worker. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char last_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1009,6 +1110,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1024,6 +1126,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1122,6 +1227,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid, Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..66b185fc9c 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 621d0cb4da..0859a791fa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1938,6 +1938,9 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionErr
+PgStat_MsgSubscriptionErrPurge
+PgStat_MsgSubscriptionPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1949,6 +1952,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubEntry
+PgStat_StatSubErrEntry
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Monday, August 23, 2021 11:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches. Please review them.
I tested v10-0001 patch in both streaming and no-streaming mode. All tests work well.
I also tried the two-phase commit feature, the error context was set as expected,
but please allow me to propose a fix suggestion on the error description:
CONTEXT: processing remote data during "INSERT" for replication target relation
"public.test" in transaction 714 with commit timestamp 2021-08-24
13:20:22.480532+08
It said "commit timestamp", but for the 2PC feature, the timestamp could be "prepare timestamp" or "rollback timestamp", too.
Could we make some change to make the error log more comprehensive?
Regards
Tang
On Tue, Aug 24, 2021 at 11:44 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Monday, August 23, 2021 11:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches. Please review them.
I tested v10-0001 patch in both streaming and no-streaming mode. All tests work well.
I also tried the two-phase commit feature, the error context was set as expected,
but please allow me to propose a fix suggestion on the error description:
CONTEXT: processing remote data during "INSERT" for replication target relation
"public.test" in transaction 714 with commit timestamp 2021-08-24
13:20:22.480532+08
It said "commit timestamp", but for the 2PC feature, the timestamp could be "prepare timestamp" or "rollback timestamp", too.
Could we make some change to make the error log more comprehensive?
I think we can write something like: (processing remote data during
"INSERT" for replication target relation "public.test" in transaction
714 at 2021-08-24 13:20:22.480532+08). Basically replacing "with
commit timestamp" with "at". This is similar to what we do in the
test_decoding module for transaction timestamps. The other idea could
be we print the exact operation like commit/prepare/rollback which is
also possible because we have that information while setting context
info but that might add a bit more complexity which I don't think is
worth it.
One more point about the v10-0001* patch: From the commit message
"Add logical changes details to errcontext of apply worker errors.",
it appears that the context will be added only for the apply worker
but won't it get added for tablesync worker as well during its sync
phase (when it tries to catch up with apply worker)?
--
With Regards,
Amit Kapila.
On Tue, Aug 24, 2021 at 10:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Aug 24, 2021 at 11:44 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Monday, August 23, 2021 11:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches. Please review them.
I tested v10-0001 patch in both streaming and no-streaming mode. All tests work well.
I also tried the two-phase commit feature, the error context was set as expected,
but please allow me to propose a fix suggestion on the error description:
Thank you for the suggestion!
CONTEXT: processing remote data during "INSERT" for replication target relation
"public.test" in transaction 714 with commit timestamp 2021-08-24
13:20:22.480532+08
It said "commit timestamp", but for the 2PC feature, the timestamp could be "prepare timestamp" or "rollback timestamp", too.
Could we make some change to make the error log more comprehensive?
I think we can write something like: (processing remote data during
"INSERT" for replication target relation "public.test" in transaction
714 at 2021-08-24 13:20:22.480532+08). Basically replacing "with
commit timestamp" with "at". This is similar to what we do in the
test_decoding module for transaction timestamps.
+1
The other idea could
be we print the exact operation like commit/prepare/rollback which is
also possible because we have that information while setting context
info but that might add a bit more complexity which I don't think is
worth it.
Agreed.
I replaced "with commit timestamp" with "at" and renamed the
'commit_ts' field to 'ts'.
One more point about the v10-0001* patch: From the commit message
"Add logical changes details to errcontext of apply worker errors.",
it appears that the context will be added only for the apply worker
but won't it get added for tablesync worker as well during its sync
phase (when it tries to catch up with apply worker)?
Right. I've updated the message.
Attached updated version patches. Please review them.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v11-0001-Add-logical-change-details-to-logical-replicatio.patch
From cb0fd795bb4c5ebf5623c66aca5db02c417e5fd6 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Jun 2021 13:21:58 +0900
Subject: [PATCH v11 1/5] Add logical change details to logical replication
worker errcontext.
Previously, on the subscriber, we set the error context callback to
add the error context to the error of the tuple data conversion
failures. This commit replaces the existing error context callback
with a comprehensive one that shows not only the details of data
conversion failures but also the details of the logical change being
applied by the apply worker or table sync worker.
The additional information displayed will be the command, transaction
id, and timestamp.
The error context callback is set when entering the main apply loop.
We incrementally update the fields while applying changes. The error
context is added to an error only while applying a change, not during
other work such as receiving data.
This will help users in diagnosing the problems that occur during
logical replication. It can also be used by the follow-up commit that
enables skipping a particular transaction on the subscriber.
---
src/backend/replication/logical/proto.c | 53 +++++
src/backend/replication/logical/worker.c | 255 ++++++++++++++++-------
src/include/replication/logicalproto.h | 1 +
src/tools/pgindent/typedefs.list | 2 +-
4 files changed, 231 insertions(+), 80 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9732982d93..cdbc6838cc 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1156,3 +1156,56 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Get string representing LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_STOP:
+ return "STREAM STOP";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_PREPARE:
+ return "STREAM PREPARE";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+
+ return NULL; /* keep compiler quiet */
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 38b493e4f5..c24cd7db1b 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -203,12 +203,6 @@ typedef struct FlushPosition
static dlist_head lsn_mapping = DLIST_STATIC_INIT(lsn_mapping);
-typedef struct SlotErrCallbackArg
-{
- LogicalRepRelMapEntry *rel;
- int remote_attnum;
-} SlotErrCallbackArg;
-
typedef struct ApplyExecutionData
{
EState *estate; /* executor state, used to track resources */
@@ -221,6 +215,27 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply information */
+typedef struct ApplyErrorCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+ LogicalRepRelMapEntry *rel;
+
+ /* Remote information */
+ int remote_attnum; /* -1 if invalid */
+ TransactionId remote_xid;
+ TimestampTz ts; /* commit, rollback, or prepare timestamp */
+} ApplyErrorCallbackArg;
+
+static ApplyErrorCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .rel = NULL,
+ .remote_attnum = -1,
+ .remote_xid = InvalidTransactionId,
+ .ts = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -335,6 +350,11 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for apply error callback */
+static void apply_error_callback(void *arg);
+static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
+static inline void reset_apply_error_context_info(void);
+
/*
* Should this worker apply changes for given relation.
*
@@ -580,26 +600,6 @@ slot_fill_defaults(LogicalRepRelMapEntry *rel, EState *estate,
ExecEvalExpr(defexprs[i], econtext, &slot->tts_isnull[defmap[i]]);
}
-/*
- * Error callback to give more context info about data conversion failures
- * while reading data from the remote server.
- */
-static void
-slot_store_error_callback(void *arg)
-{
- SlotErrCallbackArg *errarg = (SlotErrCallbackArg *) arg;
- LogicalRepRelMapEntry *rel;
-
- /* Nothing to do if remote attribute number is not set */
- if (errarg->remote_attnum < 0)
- return;
-
- rel = errarg->rel;
- errcontext("processing remote data for replication target relation \"%s.%s\" column \"%s\"",
- rel->remoterel.nspname, rel->remoterel.relname,
- rel->remoterel.attnames[errarg->remote_attnum]);
-}
-
/*
* Store tuple data into slot.
*
@@ -611,19 +611,9 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
{
int natts = slot->tts_tupleDescriptor->natts;
int i;
- SlotErrCallbackArg errarg;
- ErrorContextCallback errcallback;
ExecClearTuple(slot);
- /* Push callback + info on the error context stack */
- errarg.rel = rel;
- errarg.remote_attnum = -1;
- errcallback.callback = slot_store_error_callback;
- errcallback.arg = (void *) &errarg;
- errcallback.previous = error_context_stack;
- error_context_stack = &errcallback;
-
/* Call the "in" function for each non-dropped, non-null attribute */
Assert(natts == rel->attrmap->maplen);
for (i = 0; i < natts; i++)
@@ -637,7 +627,8 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
Assert(remoteattnum < tupleData->ncols);
- errarg.remote_attnum = remoteattnum;
+ /* Set attnum for error callback */
+ apply_error_callback_arg.remote_attnum = remoteattnum;
if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
{
@@ -685,7 +676,8 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
slot->tts_isnull[i] = true;
}
- errarg.remote_attnum = -1;
+ /* Reset attnum for error callback */
+ apply_error_callback_arg.remote_attnum = -1;
}
else
{
@@ -699,9 +691,6 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
}
}
- /* Pop the error context stack */
- error_context_stack = errcallback.previous;
-
ExecStoreVirtualTuple(slot);
}
@@ -724,8 +713,6 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
{
int natts = slot->tts_tupleDescriptor->natts;
int i;
- SlotErrCallbackArg errarg;
- ErrorContextCallback errcallback;
/* We'll fill "slot" with a virtual tuple, so we must start with ... */
ExecClearTuple(slot);
@@ -739,14 +726,6 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
memcpy(slot->tts_values, srcslot->tts_values, natts * sizeof(Datum));
memcpy(slot->tts_isnull, srcslot->tts_isnull, natts * sizeof(bool));
- /* For error reporting, push callback + info on the error context stack */
- errarg.rel = rel;
- errarg.remote_attnum = -1;
- errcallback.callback = slot_store_error_callback;
- errcallback.arg = (void *) &errarg;
- errcallback.previous = error_context_stack;
- error_context_stack = &errcallback;
-
/* Call the "in" function for each replaced attribute */
Assert(natts == rel->attrmap->maplen);
for (i = 0; i < natts; i++)
@@ -763,7 +742,8 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
{
StringInfo colvalue = &tupleData->colvalues[remoteattnum];
- errarg.remote_attnum = remoteattnum;
+ /* Set attnum for error callback */
+ apply_error_callback_arg.remote_attnum = remoteattnum;
if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
{
@@ -807,13 +787,11 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
slot->tts_isnull[i] = true;
}
- errarg.remote_attnum = -1;
+ /* Reset attnum for error callback */
+ apply_error_callback_arg.remote_attnum = -1;
}
}
- /* Pop the error context stack */
- error_context_stack = errcallback.previous;
-
/* And finally, declare that "slot" contains a valid virtual tuple */
ExecStoreVirtualTuple(slot);
}
@@ -827,6 +805,7 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.committime);
remote_final_lsn = begin_data.final_lsn;
@@ -860,6 +839,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -877,6 +857,7 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.prepare_time);
remote_final_lsn = begin_data.prepare_lsn;
@@ -962,6 +943,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -974,6 +956,7 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -1001,6 +984,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1013,6 +997,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1050,6 +1035,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1076,6 +1062,7 @@ apply_handle_stream_prepare(StringInfo s)
errmsg_internal("tablesync worker received a STREAM PREPARE message")));
logicalrep_read_stream_prepare(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.prepare_time);
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
@@ -1100,6 +1087,8 @@ apply_handle_stream_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1156,6 +1145,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ set_apply_error_context_xact(stream_xid, 0);
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1212,6 +1203,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1235,7 +1227,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ set_apply_error_context_xact(xid, 0);
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1260,6 +1255,8 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ set_apply_error_context_xact(subxid, 0);
+
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1284,6 +1281,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1315,6 +1313,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1459,6 +1459,7 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ set_apply_error_context_xact(xid, commit_data.committime);
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1473,6 +1474,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1592,6 +1595,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1615,6 +1621,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1713,6 +1722,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1766,6 +1778,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1869,6 +1884,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1894,6 +1912,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2328,44 +2349,53 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ LogicalRepMsgType saved_command;
+
+ /*
+ * Set the current command being applied. Since this function can be
+ * called recursively when applying spooled changes, save the current
+ * command.
+ */
+ saved_command = apply_error_callback_arg.command;
+ apply_error_callback_arg.command = action;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2374,49 +2404,52 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_STOP:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_PREPARE:
apply_handle_stream_prepare(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Reset the current command */
+ apply_error_callback_arg.command = saved_command;
}
/*
@@ -2517,6 +2550,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
TimeLineID tli;
+ ErrorContextCallback errcallback;
/*
* Init the ApplyMessageContext which we clean up after each replication
@@ -2537,6 +2571,14 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
/* mark as idle, before starting to loop */
pgstat_report_activity(STATE_IDLE, NULL);
+ /*
+ * Push apply error context callback. Fields will be filled during
+ * applying a change.
+ */
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
+
/* This outer loop iterates once per wait. */
for (;;)
{
@@ -2737,6 +2779,9 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
+
/* All done */
walrcv_endstreaming(LogRepWorkerWalRcvConn, &tli);
}
@@ -3649,3 +3694,55 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+/* Error callback to give more context info about the change being applied */
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+ ApplyErrorCallbackArg *errarg = &apply_error_callback_arg;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("processing remote data during \"%s\""),
+ logicalrep_message_type(errarg->command));
+
+ if (errarg->rel)
+ appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname);
+
+ if (errarg->remote_attnum >= 0)
+ appendStringInfo(&buf, _(" column \"%s\""),
+ errarg->rel->remoterel.attnames[errarg->remote_attnum]);
+
+ if (TransactionIdIsNormal(errarg->remote_xid))
+ appendStringInfo(&buf, _(" in transaction %u at %s"),
+ errarg->remote_xid,
+ errarg->ts == 0
+ ? "(unset)"
+ : timestamptz_to_str(errarg->ts));
+
+ errcontext("%s", buf.data);
+ pfree(buf.data);
+}
+
+/* Set transaction information of apply error callback */
+static inline void
+set_apply_error_context_xact(TransactionId xid, TimestampTz ts)
+{
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.ts = ts;
+}
+
+/* Reset all information of apply error callback */
+static inline void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.rel = NULL;
+ apply_error_callback_arg.remote_attnum = -1;
+ set_apply_error_context_xact(InvalidTransactionId, 0);
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 95c1561ca0..83741dcf42 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -246,5 +246,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 37cf4b2f76..621d0cb4da 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -113,6 +113,7 @@ Append
AppendPath
AppendRelInfo
AppendState
+ApplyErrorCallbackArg
ApplyExecutionData
ApplySubXactData
Archive
@@ -2423,7 +2424,6 @@ SlabBlock
SlabChunk
SlabContext
SlabSlot
-SlotErrCallbackArg
SlotNumber
SlruCtl
SlruCtlData
--
2.24.3 (Apple Git-128)
v11-0003-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch
From 661554fc2b24ded4ed0db07cadfafb98ad9c6ce2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v11 3/5] Add RESET command to ALTER SUBSCRIPTION command.
The ALTER SUBSCRIPTION ... RESET command resets subscription
parameters to their default values. The parameters that can be reset
are streaming, binary, and synchronous_commit.
The RESET command is required by a follow-up commit that introduces a
new parameter, skip_xid, which needs to be reset.
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 78 +++++++++++++++++-----
src/backend/parser/gram.y | 11 ++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 +++-
src/test/regress/sql/subscription.sql | 13 ++++
6 files changed, 109 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a6f994450d..8c3c28b7e7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -192,16 +193,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, and <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
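As a rough illustration of the SET/RESET distinction documented above, here is a Python sketch (purely illustrative, not the actual PostgreSQL implementation): SET assigns a user-supplied value, while RESET restores the parameter's CREATE SUBSCRIPTION default.

```python
# Hypothetical sketch of SET vs RESET semantics for subscription options.
# The default values mirror CREATE SUBSCRIPTION's documented defaults.
DEFAULTS = {"streaming": False, "binary": False, "synchronous_commit": "off"}

def alter_subscription(params, action, options):
    """Apply SET (name -> value mapping) or RESET (names only) to params."""
    for name in options:
        if name not in DEFAULTS:
            raise ValueError(f'unrecognized subscription parameter: "{name}"')
    if action == "SET":
        params.update(options)          # user-supplied values
    elif action == "RESET":
        for name in options:
            params[name] = DEFAULTS[name]   # back to the default
    return params
```

For example, after `SET (streaming = true)` followed by `RESET (streaming)`, the parameter is back to false, matching the regression test's expected `\dRs+` output.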
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5157f44058..cc390ce95a 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -923,10 +938,43 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +1009,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1011,7 +1059,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts |= SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1059,7 +1107,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
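The `specified_opts` bookkeeping that `parse_subscription_options()` does with the `SUBOPT_*` bitmask and the `IsSet()` macro can be mimicked as follows (a simplified sketch; flag values and names are illustrative):

```python
# Simplified sketch of the SUBOPT_* bitmask bookkeeping in
# parse_subscription_options (flag values are illustrative).
SUBOPT_SYNCHRONOUS_COMMIT = 0x0001
SUBOPT_BINARY = 0x0002
SUBOPT_STREAMING = 0x0004

def is_set(val, bits):
    # Mirrors the IsSet() macro: true only if every bit in 'bits' is set.
    return (val & bits) == bits

def parse_options(options, supported, is_reset=False):
    """Return the bitmask of options the user actually specified.

    options: list of (name, value) pairs; value is None when no '=' was given.
    """
    flags = {"synchronous_commit": SUBOPT_SYNCHRONOUS_COMMIT,
             "binary": SUBOPT_BINARY,
             "streaming": SUBOPT_STREAMING}
    specified = 0
    for name, value in options:
        if is_reset and value is not None:
            raise ValueError("RESET must not include values for parameters")
        bit = flags.get(name)
        if bit is None or not is_set(supported, bit):
            raise ValueError(f'unrecognized subscription parameter: "{name}"')
        if is_set(specified, bit):
            raise ValueError(f'conflicting or redundant options: "{name}"')
        specified |= bit
    return specified
```

The point of the bitmask is that later code can test `IsSet(opts.specified_opts, SUBOPT_BINARY)` to update only the columns the user named, which is exactly what the RESET branch in `AlterSubscription()` relies on.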
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..bcf85e8980 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,7 +9707,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 7af13dee43..3f55d63425 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,7 +3659,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3671,7 +3672,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 77b4437b69..b87f67fe55 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -284,11 +284,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index d42104c191..aa90560691 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -218,6 +218,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
Attachment: v11-0004-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch (application/octet-stream)
From 24126e3765fec4f9706185c6b7e35b1da14ae653 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v11 4/5] Add skip_xid option to ALTER SUBSCRIPTION.
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question.
The user can specify XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating pg_subscription.subskipxid field, telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of the
subscription in the pg_stat_subscription_errors system view so that
the user is not confused by stale error information. This is done by
sending a message to the stats collector to clear the subscription
error.
---
doc/src/sgml/logical-replication.sgml | 49 ++++-
doc/src/sgml/ref/alter_subscription.sgml | 32 ++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 45 +++-
src/backend/postmaster/pgstat.c | 44 +++-
src/backend/replication/logical/worker.c | 201 ++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/include/pgstat.h | 7 +-
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 +
src/test/subscription/t/024_skip_xact.pl | 244 +++++++++++++++++++++
12 files changed, 636 insertions(+), 25 deletions(-)
create mode 100644 src/test/subscription/t/024_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..d558dcfe81 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in the <structname>pg_stat_subscription_errors</structname>
+ view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> to the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Either way, these should be used only as a last resort. They skip the whole
+ transaction, including changes that may not violate any constraint, and can
+ easily make the subscriber inconsistent if the user specifies the wrong
+ transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
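To see why the documentation above warns that advancing the origin can skip more than intended while <literal>skip_xid</literal> targets exactly one transaction, consider this toy model (purely illustrative; not PostgreSQL code — transactions are modeled as (xid, end_lsn) pairs):

```python
# Toy model: each remote transaction has an XID and a commit end-LSN.
# Advancing the replication origin past a target LSN discards every
# transaction committed up to that LSN; skip_xid discards exactly one XID.
txns = [(714, 1000), (715, 1100), (716, 1200)]  # (xid, end_lsn)

def apply_after_origin_advance(txns, new_origin_lsn):
    # Everything at or before the new origin position is never replayed.
    return [xid for xid, lsn in txns if lsn > new_origin_lsn]

def apply_with_skip_xid(txns, skip_xid):
    # Only the one named transaction is dropped.
    return [xid for xid, lsn in txns if xid != skip_xid]
```

Advancing the origin to LSN 1100 loses transactions 714 and 715, whereas `skip_xid = 715` loses only 715 — which is the motivation for the subscriber-side feature.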
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 8c3c28b7e7..cfb318e08c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -205,8 +205,36 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are: <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ <literal>streaming</literal>, and the following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ will stop until the problem is resolved. The resolution can be done
+ either by changing data on the subscriber so that it doesn't conflict
+ with the incoming change or by skipping the whole transaction. This
+ option specifies the ID of the transaction whose application the
+ logical replication worker skips. The worker skips all data
+ modification changes within the specified transaction. Since it skips
+ the whole transaction, including changes that may not violate any
+ constraint, this option should only be used as a last resort. It has
+ no effect on a transaction that is already prepared with
+ <literal>two_phase</literal> enabled on the subscriber. After the
+ logical replication worker successfully skips the transaction, the
+ transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index cc390ce95a..188f3e42fd 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -885,7 +913,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, false);
@@ -934,6 +962,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
@@ -941,7 +976,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, true);
@@ -967,6 +1002,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
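The `skip_xid` parsing above calls `xidin` and then rejects non-normal XIDs. A sketch of that validation in Python (illustrative; in PostgreSQL, XIDs 0, 1, and 2 are the reserved Invalid, Bootstrap, and Frozen IDs, so `TransactionIdIsNormal` requires the value to be at least 3):

```python
# Sketch of the skip_xid validation: parse the string as a 32-bit
# transaction ID and require a "normal" XID (>= 3, since 0/1/2 are
# PostgreSQL's reserved Invalid/Bootstrap/Frozen transaction IDs).
FIRST_NORMAL_XID = 3

def parse_skip_xid(xid_str):
    xid = int(xid_str)
    if not 0 <= xid <= 0xFFFFFFFF:
        # xid is a 32-bit type
        raise ValueError("value out of range for type xid")
    if xid < FIRST_NORMAL_XID:
        raise ValueError("invalid transaction id")
    return xid
```

This mirrors why the patch's `TransactionIdIsNormal(xid)` check is needed: a user-supplied 0, 1, or 2 would otherwise be stored and could never match a real remote transaction.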
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 7e3938d0d3..911a031b60 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1743,11 +1743,32 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
msg.m_subid = subid;
msg.m_subrelid = subrelid;
+ msg.m_clear = false;
msg.m_reset = true;
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
}
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector about clear the error of subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_clear = true;
+ msg.m_reset = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -2034,6 +2055,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_subid = subid;
msg.m_subrelid = subrelid;
msg.m_reset = false;
+ msg.m_clear = false;
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
msg.m_command = command;
@@ -6134,27 +6156,37 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
PgStat_StatSubErrEntry *errent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
errent = pgstat_get_subscription_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
+ Assert(!(msg->m_reset && msg->m_clear));
+
if (errent == NULL)
return;
- /* reset fields and set reset timestamp */
errent->relid = InvalidOid;
errent->command = 0;
errent->xid = InvalidTransactionId;
errent->failure_count = 0;
- errent->last_failure = 0;
- errent->last_errmsg[0] = '\0';
- errent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
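The collector-side logic above distinguishes a user-requested "reset" from the automatic "clear" after a skipped transaction. A simplified sketch of that distinction (illustrative field names, not the actual pgstat structures): both wipe the error identity fields, but only a reset also clears the last-failure message and stamps a reset time.

```python
# Sketch of pgstat_recv_subscription_error's reset/clear handling.
# Both operations zero the error identity; reset additionally clears the
# message fields and records when the reset happened (illustrative).
import time

def handle_subscription_err_msg(entry, reset=False, clear=False):
    assert not (reset and clear)        # mutually exclusive, as in the patch
    if reset or clear:
        entry.update(relid=None, command=None, xid=None, failure_count=0)
        if reset:
            entry.update(last_failure=None, last_errmsg="",
                         stat_reset_timestamp=time.time())
    return entry
```

Keeping `last_errmsg` on a clear is a design choice one might debate: here it preserves a trace of what went wrong even after the transaction was skipped, while a reset removes everything.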
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ebe6f53b5d..ae6c9f147c 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -269,6 +270,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID if we're skipping all data modification changes
+ * (INSERT/UPDATE/DELETE/TRUNCATE) of the transaction specified by
+ * MySubscription->skipxid.  Note that we don't skip receiving the changes,
+ * particularly in streaming cases, since we decide whether or not to skip
+ * applying the changes when starting to apply them.  Once we have started
+ * skipping changes, we copy the XID to skipping_xid and don't stop skipping
+ * until the whole transaction is skipped, even if the subscription is
+ * invalidated and MySubscription->skipxid gets changed or reset.  When we
+ * stop skipping, we reset the skip XID (subskipxid) in the pg_subscription
+ * catalog and associate the origin status with the transaction that resets
+ * the skip XID, so that we can restart streaming from the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/*
* Hash table for storing the streaming xid information along with shared file
* set for streaming and subxact files.
@@ -355,6 +371,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -809,6 +828,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -833,7 +857,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop the skipping transaction if enabled. Otherwise, commit the changes
+ * that are just applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -861,6 +896,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -919,9 +957,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -935,6 +974,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop the skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1066,6 +1109,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1076,6 +1122,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop the skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1101,9 +1151,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1125,6 +1176,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1137,9 +1191,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1221,6 +1272,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1314,6 +1366,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop the skipping transaction if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1463,9 +1519,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2351,6 +2421,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recursively when applying spooled changes, save the current
@@ -3791,3 +3872,103 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID, both locally and in the
+ * catalog (pg_subscription.subskipxid). If origin_lsn and origin_committs are
+ * valid, we set the origin state to the commit of the transaction that resets
+ * the skip XID so that we can restart streaming from the transaction following
+ * the one we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know the
+ * subscription is no longer getting stuck by the conflict.
+ *
+ * The message for clearing the error statistics can be lost but that's
+ * okay. The user can know the logical replication is working fine in
+ * other ways, for example, checking pg_stat_subscription view. And the
+ * user is able to reset the single subscription error statistics by
+ * pg_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3f55d63425..93bfef0e9c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3677,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index a6914a24e5..6775736b2b 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -536,7 +536,7 @@ typedef struct PgStat_MsgReplSlot
/* ----------
* PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
- * update/reset the error happening during logical
+ * update/reset/clear the error happening during logical
* replication.
* ----------
*/
@@ -554,7 +554,9 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The reset message uses below field */
+ /* The clear and reset messages use below fields */
+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */
bool m_reset; /* Reset all fields and set reset_stats
* timestamp */
@@ -1111,6 +1113,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index b87f67fe55..217b5fabd1 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -296,6 +296,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index aa90560691..4c9d25f0a4 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -231,6 +231,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/024_skip_xact.pl b/src/test/subscription/t/024_skip_xact.pl
new file mode 100644
index 0000000000..affb663803
--- /dev/null
+++ b/src/test/subscription/t/024_skip_xact.pl
@@ -0,0 +1,244 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# Test whether the error reported in the pg_stat_subscription_errors view is expected.
+sub test_subscription_error
+{
+ my ($node, $source, $relname, $expected_error, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Check the error reported in the pg_stat_subscription_errors view and skip the
+# failed transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $source, $subname, $relname, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $source, $relname, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber so logical replication restarts immediately, without
+ # waiting for wal_retrieve_retry_interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+ # Also wait for the error details to be cleared.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT command IS NULL FROM pg_stat_subscription_errors
+WHERE subname = '$subname' AND failure_source = '$source';
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# Don't flood the server log with repeated error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data into test_tab1, raising an error on the subscriber due to
+# violating the unique constraint on test_tab1. Then skip the transaction in
+# question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber,
+ 'tablesync', 'test_tab2',
+ qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'skip the error reported by the table sync worker');
+
+# Insert enough rows into test_tab_streaming to exceed the 64kB limit, also
+# raising an error on the subscriber while applying the spooled changes, for the
+# same reason. Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error reported by the apply worker while applying streamed changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping transactions that are prepared, in both the non-streamed and
+# streamed cases. We insert the same data as in the previous tests but prepare the
+# transactions. Those insertions raise an error on the subscriber. Then we skip the
+# transactions in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error on changes of the prepared transaction');
+
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
Attachment: v11-0005-Move-shared-fileset-cleanup-to-before_shmem_exit.patch (application/octet-stream)
From 339398d022b2239a7cbaea01aa3d41a313366413 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 12 Aug 2021 10:57:41 +0900
Subject: [PATCH v11 5/5] Move shared fileset cleanup to before_shmem_exit().
The reported problem is that a shared fileset created in
SharedFileSetInit() by a logical replication apply worker is cleaned up
in SharedFileSetDeleteOnProcExit() when the process exits on an error
due to a conflict. As shared fileset cleanup causes pgstat reporting
for underlying temporary files, the assertions added in ee3f8d3d3ae
caused failures.
To fix the problem, similar to 675c945394, move shared fileset cleanup
to a before_shmem_exit() hook, ensuring that the fileset is dropped
while we can still report stats for underlying temporary files.
---
src/backend/storage/file/sharedfileset.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/src/backend/storage/file/sharedfileset.c b/src/backend/storage/file/sharedfileset.c
index ed37c940ad..0d9700bf56 100644
--- a/src/backend/storage/file/sharedfileset.c
+++ b/src/backend/storage/file/sharedfileset.c
@@ -36,7 +36,7 @@
static List *filesetlist = NIL;
static void SharedFileSetOnDetach(dsm_segment *segment, Datum datum);
-static void SharedFileSetDeleteOnProcExit(int status, Datum arg);
+static void SharedFileSetDeleteBeforeShmemExit(int status, Datum arg);
static void SharedFileSetPath(char *path, SharedFileSet *fileset, Oid tablespace);
static void SharedFilePath(char *path, SharedFileSet *fileset, const char *name);
static Oid ChooseTablespace(const SharedFileSet *fileset, const char *name);
@@ -112,7 +112,12 @@ SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
* fileset clean up.
*/
Assert(filesetlist == NIL);
- on_proc_exit(SharedFileSetDeleteOnProcExit, 0);
+
+ /*
+ * Register before-shmem-exit hook to ensure fileset is dropped
+ * while we can still report stats for underlying temporary files.
+ */
+ before_shmem_exit(SharedFileSetDeleteBeforeShmemExit, 0);
registered_cleanup = true;
}
@@ -259,12 +264,12 @@ SharedFileSetOnDetach(dsm_segment *segment, Datum datum)
}
/*
- * Callback function that will be invoked on the process exit. This will
+ * Callback function that will be invoked before shmem exit. This will
* process the list of all the registered sharedfilesets and delete the
* underlying files.
*/
static void
-SharedFileSetDeleteOnProcExit(int status, Datum arg)
+SharedFileSetDeleteBeforeShmemExit(int status, Datum arg)
{
/*
* Remove all the pending shared fileset entries. We don't use foreach()
--
2.24.3 (Apple Git-128)
Attachment: v11-0002-Add-pg_stat_subscription_errors-statistics-view.patch (application/octet-stream)
From 4d50f2c8cc54e01cb521e97533b9dca6192bc122 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v11 2/5] Add pg_stat_subscription_errors statistics view.
This commit adds a new system view, pg_stat_subscription_errors, showing
errors that happen while applying logical replication changes as well as
during initial table synchronization.
Subscription error entries are removed by autovacuum workers: once table
synchronization has completed in the table sync worker case, and when the
subscription is dropped in the apply worker case.
It also adds SQL function pg_stat_reset_subscription_error() to
reset the single subscription error.
---
doc/src/sgml/monitoring.sgml | 160 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/postmaster/pgstat.c | 651 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 51 +-
src/backend/utils/adt/pgstatfuncs.c | 113 ++++
src/backend/utils/error/elog.c | 1 -
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 106 ++++
src/test/regress/expected/rules.out | 22 +
src/tools/pgindent/typedefs.list | 5 +
11 files changed, 1147 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..0c02e46947 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that happened on a subscription, showing information
+ about the subscription errors.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is processing when the
+ error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error happened. This
+ field is always NULL if the error is reported by
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of publisher node being applied when the error
+ happened. This field is always NULL if the error is reported
+ by <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of times the error happened on the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the last failure.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..449692afa9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index a3c35bdf60..7e3938d0d3 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -40,6 +40,8 @@
#include "access/xact.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +107,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_HASH_SIZE 32
/* ----------
@@ -279,6 +282,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -329,6 +333,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubEntry *pgstat_get_subscription_entry(Oid subid,
+ bool create);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -368,6 +378,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1146,6 +1160,166 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubEntry *subent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS hstat_rel;
+ List *not_ready_rels_list;
+ HTAB *not_ready_rels_htab;
+ ListCell *lc;
+ HASHCTL hash_ctl;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = subent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (subent->suberrors == NULL)
+ continue;
+
+ /*
+			 * The subscription has error entries. We search for errors of
+			 * table sync workers whose relations are already in ready state.
+			 * Those errors should be removed.
+ *
+			 * Note that the lifetimes of error entries of the apply worker
+			 * and the table sync worker are different. The former lives
+			 * until the subscription is dropped whereas the latter lives
+			 * until the table synchronization is completed.
+ */
+ not_ready_rels_list = GetSubscriptionNotReadyRelations(subent->subid);
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ /*
+			 * The number of not-ready relations can be high, for example
+			 * right after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for faster lookups.
+ */
+ foreach(lc, not_ready_rels_list)
+ {
+ SubscriptionRelState *r_elem = (SubscriptionRelState *) lfirst(lc);
+ SubscriptionRelState *r_entry;
+
+ CHECK_FOR_INTERRUPTS();
+ r_entry = hash_search(not_ready_rels_htab, (void *) &(r_elem->relid),
+ HASH_ENTER, NULL);
+ memcpy(r_entry, r_elem, sizeof(SubscriptionRelState));
+ }
+
+ list_free(not_ready_rels_list);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+
+ /*
+			 * Search for all error entries whose relation is already in
+			 * ready state
+ */
+ hash_seq_init(&hstat_rel, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(errent->subrelid))
+ continue;
+
+ /*
+				 * Add the relid to the message if the table synchronization
+				 * for this relation has already completed or the table is no
+				 * longer subscribed.
+ */
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->subrelid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(not_ready_rels_htab);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1555,6 +1729,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the subscription error entry.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1820,6 +2013,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about the subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+	len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg) + 1;
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = false;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_failure_time = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2886,6 +3110,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3463,6 +3703,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3763,6 +4016,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionHash)
+ {
+ PgStat_StatSubEntry *subent;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS relhstat;
+			int32		nerrors;
+
+			/* Skip this subscription if it does not have any errors */
+			if (subent->suberrors == NULL)
+				continue;
+
+			nerrors = hash_get_num_entries(subent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(subent, sizeof(PgStat_StatSubEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+				/*
+				 * XXX we write the whole PgStat_StatSubErrEntry entry, which
+				 * contains the fixed-length error message string of
+				 * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes, bloating the stats
+				 * file. That's okay as long as the number of error entries
+				 * remains low; if that expectation turns out to be false we
+				 * should write the string and its length instead.
+				 */
+ rc = fwrite(errent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4224,6 +4521,100 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubEntry *subent;
+ int32 nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin) !=
+ sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+				/* Enter the subscription entry */
+ subent =
+ (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &(subbuf.subid),
+ HASH_ENTER, NULL);
+ subent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Read error entries */
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubErrEntry errbuf;
+ PgStat_StatSubErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ errent =
+ (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &(errbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(errent, &errbuf, sizeof(PgStat_StatSubErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4566,6 +4957,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4771,6 +5206,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5689,6 +6125,116 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ if (errent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ Assert(errent);
+
+ /* update the error entry */
+ errent->databaseid = msg->m_databaseid;
+ errent->relid = msg->m_relid;
+ errent->command = msg->m_command;
+ errent->xid = msg->m_xid;
+ errent->failure_count++;
+ errent->last_failure = msg->m_failure_time;
+ strlcpy(errent->last_errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+		 * Nothing to do if the subscription entry is not found. This could
+		 * happen, for example, if no error has been reported for the
+		 * subscription yet, or if a previous purge message already removed
+		 * the entry.
+ */
+ if (subent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (subent->suberrors != NULL)
+ hash_destroy(subent->suberrors);
+
+ /* Remove the subscription entry */
+ (void) hash_search(subscriptionHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subid, false);
+
+	/*
+	 * Nothing to do if the subscription entry is not found or has no error
+	 * entries. This could happen when the subscription with msg->m_subid is
+	 * removed and the corresponding entry has already been removed before
+	 * receiving the error purge message.
+	 */
+	if (subent == NULL || subent->suberrors == NULL)
+		return;
+
+	for (int i = 0; i < msg->m_nentries; i++)
+		(void) hash_search(subent->suberrors, (void *) &(msg->m_relids[i]),
+						   HASH_REMOVE, NULL);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5786,6 +6332,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_entry
+ *
+ * Return the subscription statistics entry for the given subscription OID.
+ * Return NULL if not found and the caller didn't request to create it.
+ *
+ * 'create' tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubEntry *
+pgstat_get_subscription_entry(Oid subid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ subent = (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize field */
+ if (create && !found)
+ subent->suberrors = NULL;
+
+ return subent;
+}
+
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription OID and
+ * relation OID. Return NULL if not found and the caller didn't request to
+ * create it.
+ *
+ * 'create' tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ PgStat_StatSubErrEntry *errent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ subent = pgstat_get_subscription_entry(subid, create);
+
+ if (subent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ errent = (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ errent->databaseid = InvalidOid;
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = 0;
+ }
+
+ return errent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c24cd7db1b..ebe6f53b5d 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3462,6 +3462,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3562,8 +3563,27 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3681,7 +3701,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..c454e2f8bc 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset the error stats of a subscription */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,98 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubErrEntry *errent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ errent = pgstat_fetch_subscription_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (errent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(errent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(errent->relid))
+ values[2] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[2] = true;
+
+ if (errent->command == 0)
+ nulls[3] = true;
+ else
+ values[3] = CStringGetTextDatum(logicalrep_message_type(errent->command));
+
+ if (TransactionIdIsValid(errent->xid))
+ values[4] = TransactionIdGetDatum(errent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(errent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ values[6] = Int64GetDatum(errent->failure_count);
+
+ if (errent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(errent->last_failure);
+
+ values[8] = CStringGetTextDatum(errent->last_errmsg);
+
+ if (errent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(errent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..871f7b1b15 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,7 +1441,6 @@ getinternalerrposition(void)
return edata->internalpos;
}
-
/*
* Functions to allow construction of error message strings separately from
* the ereport() call itself.
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b603700ed9..7f9c27bdda 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 509849c7ff..a6914a24e5 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -530,6 +534,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr	Sent by the apply worker or the table sync
+ *								worker to report/reset an error that happened
+ *								during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+	/* The reset message uses only the field below */
+	bool		m_reset;		/* Reset all fields and set
+								 * stat_reset_timestamp */
+
+	/* The error report message uses the fields below */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_failure_time;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge	Sent by autovacuum to purge dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge	Sent by autovacuum to purge the
+ *									subscription errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -701,6 +767,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -916,6 +985,38 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription statistics kept in the stats collector.
+ */
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubEntry;
+
+/*
+ * Subscription error statistics kept in the stats collector. One entry represents
+ * an error that happened during logical replication, reported by the apply worker
+ * (subrelid is InvalidOid) or by the table sync worker (subrelid is a valid OID).
+ * The error reported by the table sync worker is also removed when the table
+ * synchronization process completes.
+ */
+
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char last_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1009,6 +1110,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1024,6 +1126,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1122,6 +1227,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid, Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..66b185fc9c 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 621d0cb4da..0859a791fa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1938,6 +1938,9 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionErr
+PgStat_MsgSubscriptionErrPurge
+PgStat_MsgSubscriptionPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1949,6 +1952,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubEntry
+PgStat_StatSubErrEntry
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Wednesday, August 25, 2021 12:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Attached updated version patches. Please review them.
Thanks for your new patch. The v11-0001 patch LGTM.
Regards
Tang
On Wed, Aug 25, 2021 at 2:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Attached updated version patches. Please review them.
Regarding the v11-0001 patch, it looks OK to me, but I do have one point:
In apply_dispatch(), wouldn't it be better to NOT move the error
reporting for an invalid message type into the switch as the default
case - because then, if you add a new message type, you won't get a
compiler warning (when warnings are enabled) for a missing switch
case, which is a handy way to alert you that the new message type
needs to be added as a case to the switch.
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Aug 26, 2021 at 7:15 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Wed, Aug 25, 2021 at 2:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Attached updated version patches. Please review them.
Regarding the v11-0001 patch, it looks OK to me, but I do have one point:
In apply_dispatch(), wouldn't it be better to NOT move the error
reporting for an invalid message type into the switch as the default
case - because then, if you add a new message type, you won't get a
compiler warning (when warnings are enabled) for a missing switch
case, which is a handy way to alert you that the new message type
needs to be added as a case to the switch.
Do you have any suggestions on how to achieve that without adding some
additional variable? I think it is not a very hard requirement as we
don't follow the same at other places in code.
--
With Regards,
Amit Kapila.
On Thu, Aug 26, 2021 at 12:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 7:15 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Wed, Aug 25, 2021 at 2:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Attached updated version patches. Please review them.
Regarding the v11-0001 patch, it looks OK to me, but I do have one point:
In apply_dispatch(), wouldn't it be better to NOT move the error
reporting for an invalid message type into the switch as the default
case - because then, if you add a new message type, you won't get a
compiler warning (when warnings are enabled) for a missing switch
case, which is a handy way to alert you that the new message type
needs to be added as a case to the switch.
Do you have any suggestions on how to achieve that without adding some
additional variable? I think it is not a very hard requirement as we
don't follow the same at other places in code.
Yeah, I agree that it's a handy way to detect missing a switch case
but I think that we don't necessarily need it in this case. Because
there are many places in the code where doing similar things and when
it comes to apply_dispatch() it's the entry function to handle the
incoming message so it will be unlikely that we miss adding a switch
case until the patch gets committed. If we don't move it, we would end
up either adding the code resetting the
apply_error_callback_arg.command to every message type, adding a flag
indicating the message is handled and checking later, or having a big
if statement checking if the incoming message type is valid etc.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Aug 26, 2021 at 1:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Do you have any suggestions on how to achieve that without adding some
additional variable? I think it is not a very hard requirement as we
don't follow the same at other places in code.
Sorry, forget my suggestion, I see it's not easy to achieve it and
still execute the non-error-case code after the switch.
(you'd have to use a variable set in the default case, defeating the
purpose, or have the switch in a separate function with return for
each case)
So the 0001 patch LGTM.
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Aug 25, 2021 12:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Attached updated version patches. Please review them.
The v11-0001 patch LGTM.
Best regards,
Hou zj
On Thu, Aug 26, 2021 at 9:50 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Aug 26, 2021 at 12:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Yeah, I agree that it's a handy way to detect missing a switch case
but I think that we don't necessarily need it in this case. Because
there are many places in the code where doing similar things and when
it comes to apply_dispatch() it's the entry function to handle the
incoming message so it will be unlikely that we miss adding a switch
case until the patch gets committed. If we don't move it, we would end
up either adding the code resetting the
apply_error_callback_arg.command to every message type, adding a flag
indicating the message is handled and checking later, or having a big
if statement checking if the incoming message type is valid etc.
I was reviewing and making minor edits to your v11-0001* patch and
noticed that the below parts of the code could be improved:
1.
+ if (errarg->rel)
+ appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname);
+
+ if (errarg->remote_attnum >= 0)
+ appendStringInfo(&buf, _(" column \"%s\""),
+ errarg->rel->remoterel.attnames[errarg->remote_attnum]);
Isn't it better if 'remote_attnum' check is inside if (errarg->rel)
check? It will be weird to print column information without rel
information and in the current code, we don't set remote_attnum
without rel. The other possibility could be to have an Assert for rel
in 'remote_attnum' check.
2.
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
Isn't it better to reset relation info as the last thing in
apply_handle_insert/update/delete as you do for a few other
parameters? There is very little chance of error from those two
functions but still, it will be good if they ever throw an error and
it might be clear for future edits in this function that this needs to
be set as the last thing in these functions.
Note - I can take care of the above points based on whatever we agree
with, you don't need to send a new version for this.
--
With Regards,
Amit Kapila.
On Thu, Aug 26, 2021 at 11:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:50 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Aug 26, 2021 at 12:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Yeah, I agree that it's a handy way to detect missing a switch case
but I think that we don't necessarily need it in this case. Because
there are many places in the code where doing similar things and when
it comes to apply_dispatch() it's the entry function to handle the
incoming message so it will be unlikely that we miss adding a switch
case until the patch gets committed. If we don't move it, we would end
up either adding the code resetting the
apply_error_callback_arg.command to every message type, adding a flag
indicating the message is handled and checking later, or having a big
if statement checking if the incoming message type is valid etc.
I was reviewing and making minor edits to your v11-0001* patch and
noticed that the below parts of the code could be improved:
1.
+ if (errarg->rel)
+ appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname);
+
+ if (errarg->remote_attnum >= 0)
+ appendStringInfo(&buf, _(" column \"%s\""),
+ errarg->rel->remoterel.attnames[errarg->remote_attnum]);
Isn't it better if 'remote_attnum' check is inside if (errarg->rel)
check? It will be weird to print column information without rel
information and in the current code, we don't set remote_attnum
without rel. The other possibility could be to have an Assert for rel
in 'remote_attnum' check.
2.
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
Isn't it better to reset relation info as the last thing in
apply_handle_insert/update/delete as you do for a few other
parameters? There is very little chance of error from those two
functions but still, it will be good if they ever throw an error and
it might be clear for future edits in this function that this needs to
be set as the last thing in these functions.
I see that resetting it before logicalrep_rel_close has an advantage
that we might not accidentally access some information after close
which is not there in rel. I am not sure if that is the reason you
have in mind for resetting it before close.
--
With Regards,
Amit Kapila.
On Thu, Aug 26, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:50 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Aug 26, 2021 at 12:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Yeah, I agree that it's a handy way to detect missing a switch case
but I think that we don't necessarily need it in this case. Because
there are many places in the code where doing similar things and when
it comes to apply_dispatch() it's the entry function to handle the
incoming message so it will be unlikely that we miss adding a switch
case until the patch gets committed. If we don't move it, we would end
up either adding the code resetting the
apply_error_callback_arg.command to every message type, adding a flag
indicating the message is handled and checking later, or having a big
if statement checking if the incoming message type is valid etc.
I was reviewing and making minor edits to your v11-0001* patch and
noticed that the below parts of the code could be improved:
Thank you for the comments!
1.
+ if (errarg->rel)
+ appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname);
+
+ if (errarg->remote_attnum >= 0)
+ appendStringInfo(&buf, _(" column \"%s\""),
+ errarg->rel->remoterel.attnames[errarg->remote_attnum]);
Isn't it better if 'remote_attnum' check is inside if (errarg->rel)
check? It will be weird to print column information without rel
information and in the current code, we don't set remote_attnum
without rel. The other possibility could be to have an Assert for rel
in 'remote_attnum' check.
Agreed to check 'remote_attnum' inside "if (errarg->rel)".
2.
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
Isn't it better to reset relation info as the last thing in
apply_handle_insert/update/delete as you do for a few other
parameters? There is very little chance of error from those two
functions but still, it will be good if they ever throw an error and
it might be clear for future edits in this function that this needs to
be set as the last thing in these functions.
On Thu, Aug 26, 2021 at 3:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I see that resetting it before logicalrep_rel_close has an advantage
that we might not accidentally access some information after close
which is not there in rel. I am not sure if that is the reason you
have in mind for resetting it before close.
Yes, that's why I reset the apply_error_callback_arg.rel before
logicalrep_rel_close(), not at the end of the function.
Since the callback function refers to apply_error_callback_arg.rel it
still needs to be valid when an error occurs. Moving it to the end of
the function is no problem for now, but if we always reset relation
info as the last thing, I think that we cannot allow adding changes
between setting relation info and the end of the function (i.g.,
resetting relation info) that could lead to invalidate fields of
apply_error_callback_arg.rel (e.g, freeing a string value etc).
Note - I can take care of the above points based on whatever we agree
with, you don't need to send a new version for this.
Thanks!
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Aug 26, 2021 at 4:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Aug 26, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
1.
+ if (errarg->rel)
+ appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname);
+
+ if (errarg->remote_attnum >= 0)
+ appendStringInfo(&buf, _(" column \"%s\""),
+ errarg->rel->remoterel.attnames[errarg->remote_attnum]);
Isn't it better if 'remote_attnum' check is inside if (errarg->rel)
check? It will be weird to print column information without rel
information and in the current code, we don't set remote_attnum
without rel. The other possibility could be to have an Assert for rel
in 'remote_attnum' check.
Agreed to check 'remote_attnum' inside "if (errarg->rel)".
Okay, changed accordingly. Additionally, I have changed the code which
sets timestamp to (unset) when it is 0 so that it won't display the
timestamp in that case. I have made few other cosmetic changes in the
attached patch. See and let me know what you think of it?
Note - I have just attached the first patch here, once this is
committed we can focus on others.
--
With Regards,
Amit Kapila.
Attachments:
v12-0001-Add-logical-change-details-to-logical-replicatio.patch (application/octet-stream)
From 0d8b6b33ac1f8be8f341cbe64598b8838010899a Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 26 Aug 2021 17:33:36 +0530
Subject: [PATCH v12] Add logical change details to logical replication worker
errcontext.
Previously, on the subscriber, we set the error context callback to add
the error context to the error of the tuple data conversion failures. This
commit replaces the existing error context callback with a comprehensive
error context callback so that it shows not only the details of data
conversion failures but also the details of logical change being applied
by the apply worker or table sync worker. The additional information
displayed will be the command, transaction id, and timestamp.
The error context callback is set when entering the main apply loop. We
incrementally update the fields during applying changes. The error context
is added to an error only when applying a change but not when other work
such as receiving data etc.
This will help users in diagnosing the problems that occur during logical
replication. It also can be used by the follow-up commit that enables to
skip the particular transaction on the subscriber.
Author: Masahiko Sawada
Reviewed-by: Hou Zhijie, Greg Nancarrow, Haiying Tang, Amit Kapila
Tested-by: Haiying Tang
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
src/backend/replication/logical/proto.c | 53 +++++++
src/backend/replication/logical/worker.c | 259 +++++++++++++++++++++----------
src/include/replication/logicalproto.h | 1 +
src/tools/pgindent/typedefs.list | 2 +-
4 files changed, 235 insertions(+), 80 deletions(-)
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 9732982..9f5bf4b 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -1156,3 +1156,56 @@ logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
*xid = pq_getmsgint(in, 4);
*subxid = pq_getmsgint(in, 4);
}
+
+/*
+ * Get string representing LogicalRepMsgType.
+ */
+char *
+logicalrep_message_type(LogicalRepMsgType action)
+{
+ switch (action)
+ {
+ case LOGICAL_REP_MSG_BEGIN:
+ return "BEGIN";
+ case LOGICAL_REP_MSG_COMMIT:
+ return "COMMIT";
+ case LOGICAL_REP_MSG_ORIGIN:
+ return "ORIGIN";
+ case LOGICAL_REP_MSG_INSERT:
+ return "INSERT";
+ case LOGICAL_REP_MSG_UPDATE:
+ return "UPDATE";
+ case LOGICAL_REP_MSG_DELETE:
+ return "DELETE";
+ case LOGICAL_REP_MSG_TRUNCATE:
+ return "TRUNCATE";
+ case LOGICAL_REP_MSG_RELATION:
+ return "RELATION";
+ case LOGICAL_REP_MSG_TYPE:
+ return "TYPE";
+ case LOGICAL_REP_MSG_MESSAGE:
+ return "MESSAGE";
+ case LOGICAL_REP_MSG_BEGIN_PREPARE:
+ return "BEGIN PREPARE";
+ case LOGICAL_REP_MSG_PREPARE:
+ return "PREPARE";
+ case LOGICAL_REP_MSG_COMMIT_PREPARED:
+ return "COMMIT PREPARED";
+ case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+ return "ROLLBACK PREPARED";
+ case LOGICAL_REP_MSG_STREAM_START:
+ return "STREAM START";
+ case LOGICAL_REP_MSG_STREAM_STOP:
+ return "STREAM STOP";
+ case LOGICAL_REP_MSG_STREAM_COMMIT:
+ return "STREAM COMMIT";
+ case LOGICAL_REP_MSG_STREAM_ABORT:
+ return "STREAM ABORT";
+ case LOGICAL_REP_MSG_STREAM_PREPARE:
+ return "STREAM PREPARE";
+ }
+
+ elog(ERROR, "invalid logical replication message type \"%c\"", action);
+
+ return NULL; /* keep compiler quiet */
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 38b493e..295b1e0 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -203,12 +203,6 @@ typedef struct FlushPosition
static dlist_head lsn_mapping = DLIST_STATIC_INIT(lsn_mapping);
-typedef struct SlotErrCallbackArg
-{
- LogicalRepRelMapEntry *rel;
- int remote_attnum;
-} SlotErrCallbackArg;
-
typedef struct ApplyExecutionData
{
EState *estate; /* executor state, used to track resources */
@@ -221,6 +215,27 @@ typedef struct ApplyExecutionData
PartitionTupleRouting *proute; /* partition routing info */
} ApplyExecutionData;
+/* Struct for saving and restoring apply errcontext information */
+typedef struct ApplyErrorCallbackArg
+{
+ LogicalRepMsgType command; /* 0 if invalid */
+ LogicalRepRelMapEntry *rel;
+
+ /* Remote node information */
+ int remote_attnum; /* -1 if invalid */
+ TransactionId remote_xid;
+ TimestampTz ts; /* commit, rollback, or prepare timestamp */
+} ApplyErrorCallbackArg;
+
+static ApplyErrorCallbackArg apply_error_callback_arg =
+{
+ .command = 0,
+ .rel = NULL,
+ .remote_attnum = -1,
+ .remote_xid = InvalidTransactionId,
+ .ts = 0,
+};
+
/*
* Stream xid hash entry. Whenever we see a new xid we create this entry in the
* xidhash and along with it create the streaming file and store the fileset handle.
@@ -335,6 +350,11 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for apply error callback */
+static void apply_error_callback(void *arg);
+static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
+static inline void reset_apply_error_context_info(void);
+
/*
* Should this worker apply changes for given relation.
*
@@ -581,26 +601,6 @@ slot_fill_defaults(LogicalRepRelMapEntry *rel, EState *estate,
}
/*
- * Error callback to give more context info about data conversion failures
- * while reading data from the remote server.
- */
-static void
-slot_store_error_callback(void *arg)
-{
- SlotErrCallbackArg *errarg = (SlotErrCallbackArg *) arg;
- LogicalRepRelMapEntry *rel;
-
- /* Nothing to do if remote attribute number is not set */
- if (errarg->remote_attnum < 0)
- return;
-
- rel = errarg->rel;
- errcontext("processing remote data for replication target relation \"%s.%s\" column \"%s\"",
- rel->remoterel.nspname, rel->remoterel.relname,
- rel->remoterel.attnames[errarg->remote_attnum]);
-}
-
-/*
* Store tuple data into slot.
*
* Incoming data can be either text or binary format.
@@ -611,19 +611,9 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
{
int natts = slot->tts_tupleDescriptor->natts;
int i;
- SlotErrCallbackArg errarg;
- ErrorContextCallback errcallback;
ExecClearTuple(slot);
- /* Push callback + info on the error context stack */
- errarg.rel = rel;
- errarg.remote_attnum = -1;
- errcallback.callback = slot_store_error_callback;
- errcallback.arg = (void *) &errarg;
- errcallback.previous = error_context_stack;
- error_context_stack = &errcallback;
-
/* Call the "in" function for each non-dropped, non-null attribute */
Assert(natts == rel->attrmap->maplen);
for (i = 0; i < natts; i++)
@@ -637,7 +627,8 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
Assert(remoteattnum < tupleData->ncols);
- errarg.remote_attnum = remoteattnum;
+ /* Set attnum for error callback */
+ apply_error_callback_arg.remote_attnum = remoteattnum;
if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
{
@@ -685,7 +676,8 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
slot->tts_isnull[i] = true;
}
- errarg.remote_attnum = -1;
+ /* Reset attnum for error callback */
+ apply_error_callback_arg.remote_attnum = -1;
}
else
{
@@ -699,9 +691,6 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
}
}
- /* Pop the error context stack */
- error_context_stack = errcallback.previous;
-
ExecStoreVirtualTuple(slot);
}
@@ -724,8 +713,6 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
{
int natts = slot->tts_tupleDescriptor->natts;
int i;
- SlotErrCallbackArg errarg;
- ErrorContextCallback errcallback;
/* We'll fill "slot" with a virtual tuple, so we must start with ... */
ExecClearTuple(slot);
@@ -739,14 +726,6 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
memcpy(slot->tts_values, srcslot->tts_values, natts * sizeof(Datum));
memcpy(slot->tts_isnull, srcslot->tts_isnull, natts * sizeof(bool));
- /* For error reporting, push callback + info on the error context stack */
- errarg.rel = rel;
- errarg.remote_attnum = -1;
- errcallback.callback = slot_store_error_callback;
- errcallback.arg = (void *) &errarg;
- errcallback.previous = error_context_stack;
- error_context_stack = &errcallback;
-
/* Call the "in" function for each replaced attribute */
Assert(natts == rel->attrmap->maplen);
for (i = 0; i < natts; i++)
@@ -763,7 +742,8 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
{
StringInfo colvalue = &tupleData->colvalues[remoteattnum];
- errarg.remote_attnum = remoteattnum;
+ /* Set attnum for error callback */
+ apply_error_callback_arg.remote_attnum = remoteattnum;
if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
{
@@ -807,13 +787,11 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
slot->tts_isnull[i] = true;
}
- errarg.remote_attnum = -1;
+ /* Reset attnum for error callback */
+ apply_error_callback_arg.remote_attnum = -1;
}
}
- /* Pop the error context stack */
- error_context_stack = errcallback.previous;
-
/* And finally, declare that "slot" contains a valid virtual tuple */
ExecStoreVirtualTuple(slot);
}
@@ -827,6 +805,7 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.committime);
remote_final_lsn = begin_data.final_lsn;
@@ -860,6 +839,7 @@ apply_handle_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -877,6 +857,7 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
+ set_apply_error_context_xact(begin_data.xid, begin_data.prepare_time);
remote_final_lsn = begin_data.prepare_lsn;
@@ -962,6 +943,7 @@ apply_handle_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -974,6 +956,7 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -1001,6 +984,7 @@ apply_handle_commit_prepared(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1013,6 +997,7 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+ set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1050,6 +1035,7 @@ apply_handle_rollback_prepared(StringInfo s)
process_syncing_tables(rollback_data.rollback_end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1076,6 +1062,7 @@ apply_handle_stream_prepare(StringInfo s)
errmsg_internal("tablesync worker received a STREAM PREPARE message")));
logicalrep_read_stream_prepare(s, &prepare_data);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.prepare_time);
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
@@ -1100,6 +1087,8 @@ apply_handle_stream_prepare(StringInfo s)
process_syncing_tables(prepare_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1156,6 +1145,8 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
+ set_apply_error_context_xact(stream_xid, 0);
+
/*
* Initialize the xidhash table if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
@@ -1212,6 +1203,7 @@ apply_handle_stream_stop(StringInfo s)
MemoryContextReset(LogicalStreamingContext);
pgstat_report_activity(STATE_IDLE, NULL);
+ reset_apply_error_context_info();
}
/*
@@ -1235,7 +1227,10 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
+ {
+ set_apply_error_context_xact(xid, 0);
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
+ }
else
{
/*
@@ -1260,6 +1255,8 @@ apply_handle_stream_abort(StringInfo s)
char path[MAXPGPATH];
StreamXidHash *ent;
+ set_apply_error_context_xact(subxid, 0);
+
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1284,6 +1281,7 @@ apply_handle_stream_abort(StringInfo s)
cleanup_subxact_info();
end_replication_step();
CommitTransactionCommand();
+ reset_apply_error_context_info();
return;
}
@@ -1315,6 +1313,8 @@ apply_handle_stream_abort(StringInfo s)
end_replication_step();
CommitTransactionCommand();
}
+
+ reset_apply_error_context_info();
}
/*
@@ -1459,6 +1459,7 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
+ set_apply_error_context_xact(xid, commit_data.committime);
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -1473,6 +1474,8 @@ apply_handle_stream_commit(StringInfo s)
process_syncing_tables(commit_data.end_lsn);
pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
}
/*
@@ -1592,6 +1595,9 @@ apply_handle_insert(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Initialize the executor state. */
edata = create_edata_for_relation(rel);
estate = edata->estate;
@@ -1615,6 +1621,9 @@ apply_handle_insert(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1713,6 +1722,9 @@ apply_handle_update(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the update. */
check_relation_updatable(rel);
@@ -1766,6 +1778,9 @@ apply_handle_update(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -1869,6 +1884,9 @@ apply_handle_delete(StringInfo s)
return;
}
+ /* Set relation for error callback */
+ apply_error_callback_arg.rel = rel;
+
/* Check if we can do the delete. */
check_relation_updatable(rel);
@@ -1894,6 +1912,9 @@ apply_handle_delete(StringInfo s)
finish_edata(edata);
+ /* Reset relation for error callback */
+ apply_error_callback_arg.rel = NULL;
+
logicalrep_rel_close(rel, NoLock);
end_replication_step();
@@ -2328,44 +2349,53 @@ static void
apply_dispatch(StringInfo s)
{
LogicalRepMsgType action = pq_getmsgbyte(s);
+ LogicalRepMsgType saved_command;
+
+ /*
+ * Set the current command being applied. Since this function can be
+ * called recursively when applying spooled changes, save the current
+ * command.
+ */
+ saved_command = apply_error_callback_arg.command;
+ apply_error_callback_arg.command = action;
switch (action)
{
case LOGICAL_REP_MSG_BEGIN:
apply_handle_begin(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT:
apply_handle_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_INSERT:
apply_handle_insert(s);
- return;
+ break;
case LOGICAL_REP_MSG_UPDATE:
apply_handle_update(s);
- return;
+ break;
case LOGICAL_REP_MSG_DELETE:
apply_handle_delete(s);
- return;
+ break;
case LOGICAL_REP_MSG_TRUNCATE:
apply_handle_truncate(s);
- return;
+ break;
case LOGICAL_REP_MSG_RELATION:
apply_handle_relation(s);
- return;
+ break;
case LOGICAL_REP_MSG_TYPE:
apply_handle_type(s);
- return;
+ break;
case LOGICAL_REP_MSG_ORIGIN:
apply_handle_origin(s);
- return;
+ break;
case LOGICAL_REP_MSG_MESSAGE:
@@ -2374,49 +2404,52 @@ apply_dispatch(StringInfo s)
* Although, it could be used by other applications that use this
* output plugin.
*/
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_START:
apply_handle_stream_start(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_STOP:
apply_handle_stream_stop(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_ABORT:
apply_handle_stream_abort(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_COMMIT:
apply_handle_stream_commit(s);
- return;
+ break;
case LOGICAL_REP_MSG_BEGIN_PREPARE:
apply_handle_begin_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_PREPARE:
apply_handle_prepare(s);
- return;
+ break;
case LOGICAL_REP_MSG_COMMIT_PREPARED:
apply_handle_commit_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
apply_handle_rollback_prepared(s);
- return;
+ break;
case LOGICAL_REP_MSG_STREAM_PREPARE:
apply_handle_stream_prepare(s);
- return;
+ break;
+
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("invalid logical replication message type \"%c\"", action)));
}
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("invalid logical replication message type \"%c\"",
- action)));
+ /* Reset the current command */
+ apply_error_callback_arg.command = saved_command;
}
/*
@@ -2517,6 +2550,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
TimeLineID tli;
+ ErrorContextCallback errcallback;
/*
* Init the ApplyMessageContext which we clean up after each replication
@@ -2537,6 +2571,14 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
/* mark as idle, before starting to loop */
pgstat_report_activity(STATE_IDLE, NULL);
+ /*
+ * Push apply error context callback. Fields will be filled during
+ * applying a change.
+ */
+ errcallback.callback = apply_error_callback;
+ errcallback.previous = error_context_stack;
+ error_context_stack = &errcallback;
+
/* This outer loop iterates once per wait. */
for (;;)
{
@@ -2737,6 +2779,9 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
+ /* Pop the error context stack */
+ error_context_stack = errcallback.previous;
+
/* All done */
walrcv_endstreaming(LogRepWorkerWalRcvConn, &tli);
}
@@ -3649,3 +3694,59 @@ IsLogicalWorker(void)
{
return MyLogicalRepWorker != NULL;
}
+
+/* Error callback to give more context info about the change being applied */
+static void
+apply_error_callback(void *arg)
+{
+ StringInfoData buf;
+ ApplyErrorCallbackArg *errarg = &apply_error_callback_arg;
+
+ if (apply_error_callback_arg.command == 0)
+ return;
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, _("processing remote data during \"%s\""),
+ logicalrep_message_type(errarg->command));
+
+ /* append relation information */
+ if (errarg->rel)
+ {
+ appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname);
+ if (errarg->remote_attnum >= 0)
+ appendStringInfo(&buf, _(" column \"%s\""),
+ errarg->rel->remoterel.attnames[errarg->remote_attnum]);
+ }
+
+ /* append transaction information */
+ if (TransactionIdIsNormal(errarg->remote_xid))
+ {
+ appendStringInfo(&buf, _(" in transaction %u"), errarg->remote_xid);
+ if (errarg->ts != 0)
+ appendStringInfo(&buf, _(" at %s"),
+ timestamptz_to_str(errarg->ts));
+ }
+
+ errcontext("%s", buf.data);
+ pfree(buf.data);
+}
+
+/* Set transaction information of apply error callback */
+static inline void
+set_apply_error_context_xact(TransactionId xid, TimestampTz ts)
+{
+ apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.ts = ts;
+}
+
+/* Reset all information of apply error callback */
+static inline void
+reset_apply_error_context_info(void)
+{
+ apply_error_callback_arg.command = 0;
+ apply_error_callback_arg.rel = NULL;
+ apply_error_callback_arg.remote_attnum = -1;
+ set_apply_error_context_xact(InvalidTransactionId, 0);
+}
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 95c1561..83741dc 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -246,5 +246,6 @@ extern void logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
TransactionId subxid);
extern void logicalrep_read_stream_abort(StringInfo in, TransactionId *xid,
TransactionId *subxid);
+extern char *logicalrep_message_type(LogicalRepMsgType action);
#endif /* LOGICAL_PROTO_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 37cf4b2..621d0cb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -113,6 +113,7 @@ Append
AppendPath
AppendRelInfo
AppendState
+ApplyErrorCallbackArg
ApplyExecutionData
ApplySubXactData
Archive
@@ -2423,7 +2424,6 @@ SlabBlock
SlabChunk
SlabContext
SlabSlot
-SlotErrCallbackArg
SlotNumber
SlruCtl
SlruCtlData
--
1.8.3.1
On Thu, Aug 26, 2021 at 9:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 4:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Aug 26, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
1.
+ if (errarg->rel)
+     appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
+                      errarg->rel->remoterel.nspname,
+                      errarg->rel->remoterel.relname);
+
+ if (errarg->remote_attnum >= 0)
+     appendStringInfo(&buf, _(" column \"%s\""),
+                      errarg->rel->remoterel.attnames[errarg->remote_attnum]);

Isn't it better if the 'remote_attnum' check is inside the if (errarg->rel)
check? It will be weird to print column information without rel
information, and in the current code we don't set remote_attnum
without rel. The other possibility could be to have an Assert for rel
in the 'remote_attnum' check.

Agreed to check 'remote_attnum' inside "if (errarg->rel)".
Okay, changed accordingly. Additionally, I have changed the code which
sets the timestamp to (unset) when it is 0 so that it won't display the
timestamp in that case. I have made a few other cosmetic changes in the
attached patch. See it and let me know what you think of it?
Thank you for the patch!
Agreed with these changes. The patch looks good to me.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Aug 26, 2021 at 6:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Okay, changed accordingly. Additionally, I have changed the code which
sets timestamp to (unset) when it is 0 so that it won't display the
timestamp in that case. I have made few other cosmetic changes in the
attached patch. See and let me know what you think of it?

Thank you for the patch!
Agreed with these changes. The patch looks good to me.
Pushed, feel free to rebase and send the remaining patch set.
--
With Regards,
Amit Kapila.
On Fri, Aug 27, 2021 at 1:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 6:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Okay, changed accordingly. Additionally, I have changed the code which
sets timestamp to (unset) when it is 0 so that it won't display the
timestamp in that case. I have made few other cosmetic changes in the
attached patch. See and let me know what you think of it?

Thank you for the patch!
Agreed with these changes. The patch looks good to me.
Pushed, feel free to rebase and send the remaining patch set.
Thanks!
I'll post the updated version patch on Monday.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Aug 27, 2021 at 8:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Aug 27, 2021 at 1:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 26, 2021 at 6:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Aug 26, 2021 at 9:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Okay, changed accordingly. Additionally, I have changed the code which
sets timestamp to (unset) when it is 0 so that it won't display the
timestamp in that case. I have made few other cosmetic changes in the
attached patch. See and let me know what you think of it?

Thank you for the patch!
Agreed with these changes. The patch looks good to me.
Pushed, feel free to rebase and send the remaining patch set.
Thanks!
I'll post the updated version patch on Monday.
I've attached rebased patches. The 0004 patch is not in the scope of this
patch set; it's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
Regards,
[1]: /messages/by-id/CAFiTN-v-zFpmm7Ze1Dai5LJjhhNYXvK8QABs35X89WY0HDG4Ww@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v12-0001-Add-pg_stat_subscription_errors-statistics-view.patch (application/octet-stream)
From ca22fd7b097d2262b1dae21bcca19678f5638afd Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v12 1/4] Add pg_stat_subscription_errors statistics view.
This commit adds a new system view, pg_stat_subscription_errors,
showing errors that happened during applying logical replication changes
as well as during the initial table synchronization.

The subscription error entries are removed by autovacuum workers: when
the table synchronization has completed, in table sync worker cases, and
when the subscription is dropped, in apply worker cases.

It also adds the SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
---
doc/src/sgml/monitoring.sgml | 160 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/postmaster/pgstat.c | 651 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 51 +-
src/backend/utils/adt/pgstatfuncs.c | 113 ++++
src/backend/utils/error/elog.c | 1 -
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 106 ++++
src/test/regress/expected/rules.out | 22 +
src/tools/pgindent/typedefs.list | 5 +
11 files changed, 1147 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..0c02e46947 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that happened on a subscription, showing information
+ about the subscription errors.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription is created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error happened. This
+ field is always NULL if the error is reported by a
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher's transaction being applied when
+ the error happened. This field is always NULL if the error is
+ reported by a <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of the worker that reported the error: <literal>apply</literal>
+ or <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of times the error happened on the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error happened.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Error message reported at the last failure time.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..449692afa9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_database d ON (e.datid = d.oid)
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 4a280897b1..478b900769 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -40,6 +40,8 @@
#include "access/xact.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +107,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_HASH_SIZE 32
/* ----------
@@ -279,6 +282,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -329,6 +333,12 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubEntry *pgstat_get_subscription_entry(Oid subid,
+ bool create);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -368,6 +378,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1146,6 +1160,166 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubEntry *subent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS hstat_rel;
+ List *not_ready_rels_list;
+ HTAB *not_ready_rels_htab;
+ ListCell *lc;
+ HASHCTL hash_ctl;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = subent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no error
+ * entries.
+ */
+ if (subent->suberrors == NULL)
+ continue;
+
+ /*
+ * The subscription has error entries. We search errors of the
+ * table sync workers that are already in sync state. Those errors
+ * should be removed.
+ *
+ * Note that the lifetime of error entries of the apply worker and
+ * the table sync worker are different. The former lives until
+ * the subscription is dropped whereas the latter lives until the
+ * table synchronization is completed.
+ */
+ not_ready_rels_list = GetSubscriptionNotReadyRelations(subent->subid);
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ /*
+ * The number of not-ready relations can be high for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for faster lookups.
+ */
+ foreach(lc, not_ready_rels_list)
+ {
+ SubscriptionRelState *r_elem = (SubscriptionRelState *) lfirst(lc);
+ SubscriptionRelState *r_entry;
+
+ CHECK_FOR_INTERRUPTS();
+ r_entry = hash_search(not_ready_rels_htab, (void *) &(r_elem->relid),
+ HASH_ENTER, NULL);
+ memcpy(r_entry, r_elem, sizeof(SubscriptionRelState));
+ }
+
+ list_free(not_ready_rels_list);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+
+ /*
+ * Search for all error entries whose relation is already in ready
+ * state
+ */
+ hash_seq_init(&hstat_rel, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip the apply worker's error */
+ if (!OidIsValid(errent->subrelid))
+ continue;
+
+ /*
+ * Add the relid to the message if the table synchronization
+ * for this relation has already completed or the table is no
+ * longer subscribed.
+ */
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->subrelid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->subrelid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(not_ready_rels_htab);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1555,6 +1729,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the subscription error.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1823,6 +2016,37 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about the subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = false;
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_failure_time = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2891,6 +3115,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3468,6 +3708,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3768,6 +4021,50 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionHash)
+ {
+ PgStat_StatSubEntry *subent;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS relhstat;
+ int32 nerrors;
+
+ /* Skip this subscription if it does not have any errors */
+ if (subent->suberrors == NULL)
+ continue;
+
+ nerrors = hash_get_num_entries(subent->suberrors);
+
+ fputc('S', fpout);
+ rc = fwrite(subent, sizeof(PgStat_StatSubEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ hash_seq_init(&relhstat, subent->suberrors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubErrEntry entry, which
+ * contains the fixed-length error message string of
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes, bloating the stats
+ * file. That's okay since we assume that the number of
+ * error entries is not high. But if that expectation turns out
+ * to be false, we should write the string and its length instead.
+ */
+ rc = fwrite(errent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4229,6 +4526,100 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubEntry *subent;
+ int32 nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin) !=
+ sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription error entry */
+ subent =
+ (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &(subbuf.subid),
+ HASH_ENTER, NULL);
+ subent->suberrors = NULL;
+
+ /* Read the number of errors in the subscription */
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Read error entries */
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubErrEntry errbuf;
+ PgStat_StatSubErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the error information to the subscription
+ * hash
+ */
+ errent =
+ (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &(errbuf.subrelid),
+ HASH_ENTER, NULL);
+
+ memcpy(errent, &errbuf, sizeof(PgStat_StatSubErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4571,6 +4962,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4776,6 +5211,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5694,6 +6130,116 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ if (errent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
+ else
+ {
+ Assert(errent);
+
+ /* update the error entry */
+ errent->databaseid = msg->m_databaseid;
+ errent->relid = msg->m_relid;
+ errent->command = msg->m_command;
+ errent->xid = msg->m_xid;
+ errent->failure_count++;
+ errent->last_failure = msg->m_failure_time;
+ strlcpy(errent->last_errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription was dropped and its entry had already
+ * been removed before this purge message arrived.
+ */
+ if (subent == NULL)
+ continue;
+
+ /* Cleanup the hash table for errors */
+ if (subent->suberrors != NULL)
+ hash_destroy(subent->suberrors);
+
+ /* Remove the subscription entry */
+ (void) hash_search(subscriptionHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subid, false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription with msg->m_subid is removed and the
+ * corresponding entry is also removed before receiving the error purge
+ * message.
+ */
+ if (subent == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ (void) hash_search(subent->suberrors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5791,6 +6337,111 @@ pgstat_get_replslot_entry(NameData name, bool create)
return slotent;
}
+/* ----------
+ * pgstat_get_subscription_entry
+ *
+ * Return the subscription statistics entry for the given subscription OID.
+ * Return NULL
+ * if not found and the caller didn't request to create it.
+ *
+ * 'create' tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubEntry *
+pgstat_get_subscription_entry(Oid subid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ subent = (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize field */
+ if (create && !found)
+ subent->suberrors = NULL;
+
+ return subent;
+}
+
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription
+ * OID and relation OID. Return NULL if not found and the caller didn't
+ * request to create it.
+ *
+ * 'create' tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ PgStat_StatSubErrEntry *errent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ subent = pgstat_get_subscription_entry(subid, create);
+
+ if (subent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (subent->suberrors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subent->suberrors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ errent = (PgStat_StatSubErrEntry *) hash_search(subent->suberrors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ errent->databaseid = InvalidOid;
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = 0;
+ }
+
+ return errent;
+}
+
/* ----------
* pgstat_reset_replslot
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index bfb7d1a261..b11291f1ff 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3486,6 +3486,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3586,8 +3587,27 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3705,7 +3725,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..c454e2f8bc 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset the error stats of a subscription */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,98 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 10
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubErrEntry *errent;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ errent = pgstat_fetch_subscription_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (errent == NULL)
+ PG_RETURN_NULL();
+
+ values[0] = ObjectIdGetDatum(errent->databaseid);
+ values[1] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(errent->relid))
+ values[2] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[2] = true;
+
+ if (errent->command == 0)
+ nulls[3] = true;
+ else
+ values[3] = CStringGetTextDatum(logicalrep_message_type(errent->command));
+
+ if (TransactionIdIsValid(errent->xid))
+ values[4] = TransactionIdGetDatum(errent->xid);
+ else
+ nulls[4] = true;
+
+ if (OidIsValid(errent->subrelid))
+ values[5] = CStringGetTextDatum("tablesync");
+ else
+ values[5] = CStringGetTextDatum("apply");
+
+ values[6] = Int64GetDatum(errent->failure_count);
+
+ if (errent->last_failure == 0)
+ nulls[7] = true;
+ else
+ values[7] = TimestampTzGetDatum(errent->last_failure);
+
+ values[8] = CStringGetTextDatum(errent->last_errmsg);
+
+ if (errent->stat_reset_timestamp == 0)
+ nulls[9] = true;
+ else
+ values[9] = TimestampTzGetDatum(errent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..871f7b1b15 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,7 +1441,6 @@ getinternalerrposition(void)
return edata->internalpos;
}
-
/*
* Functions to allow construction of error message strings separately from
* the ereport() call itself.
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b603700ed9..7f9c27bdda 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,datid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 509849c7ff..a6914a24e5 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -530,6 +534,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
+ * update or reset an error that happened during
+ * logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* The reset message uses below field */
+ bool m_reset; /* Reset all fields and set reset_stats
+ * timestamp */
+
+ /* The error report message uses below fields */
+ Oid m_databaseid;
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_failure_time;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by autovacuum to purge the subscription
+ * entries.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by autovacuum to purge the subscription
+ * error entries.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -701,6 +767,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -916,6 +985,38 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription statistics kept in the stats collector.
+ */
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+ HTAB *suberrors;
+} PgStat_StatSubEntry;
+
+/*
+ * Subscription error statistics kept in the stats collector. One entry represents
+ * an error that happened during logical replication, reported by the apply worker
+ * (subrelid is InvalidOid) or by the table sync worker (subrelid is a valid OID).
+ * The error reported by the table sync worker is also removed when the
+ * table synchronization process completes.
+ */
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
+ Oid databaseid;
+ Oid relid; /* OID of relation related to the error. Must
+ * be the same as subrelid in the table sync
+ * case. */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char last_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubErrEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1009,6 +1110,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1024,6 +1126,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1122,6 +1227,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid, Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..66b185fc9c 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(datid, subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_database d ON ((e.datid = d.oid)))
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index f31a1e4e1e..e32a4f678e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1939,6 +1939,9 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionErr
+PgStat_MsgSubscriptionErrPurge
+PgStat_MsgSubscriptionPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1950,6 +1953,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubEntry
+PgStat_StatSubErrEntry
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
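For reviewers skimming the stats-collector side of the previous patch: the bookkeeping boils down to a two-level map, keyed first by subscription OID and then by subrelid (InvalidOid for the apply worker's entry), with reset-on-missing-entry being a no-op and subscription drop discarding the whole per-relation hash. A minimal Python sketch of that structure (purely illustrative; the names and the simplified entry fields are mine, a real entry also carries databaseid, command, xid, and timestamps):

```python
# Illustrative model of the collector's subscription error bookkeeping.
INVALID_OID = 0  # stands in for InvalidOid (the apply worker's entry)

class SubErrEntry:
    def __init__(self):
        self.failure_count = 0
        self.last_errmsg = ""

subscriptions = {}  # subid -> {subrelid -> SubErrEntry}

def report_error(subid, subrelid, errmsg):
    # Mirrors HASH_ENTER: create the nested entry on first report
    errent = subscriptions.setdefault(subid, {}).setdefault(subrelid, SubErrEntry())
    errent.failure_count += 1
    errent.last_errmsg = errmsg[:256]  # PGSTAT_SUBSCRIPTIONERR_MSGLEN analogue

def reset_error(subid, subrelid):
    # Resetting a non-existent entry is a no-op, as in pgstat_recv_subscription_error()
    errent = subscriptions.get(subid, {}).get(subrelid)
    if errent is not None:
        errent.failure_count = 0
        errent.last_errmsg = ""

def purge_subscription(subid):
    # Dropping a subscription discards its whole per-relation error hash
    subscriptions.pop(subid, None)

report_error(16395, INVALID_OID, 'duplicate key value violates unique constraint "test_pkey"')
report_error(16395, INVALID_OID, 'duplicate key value violates unique constraint "test_pkey"')
```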
From b9c3bc135d94ac314766f2689c5e2c2ad549d32e Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v12 3/4] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates a constraint, logical replication stops
until the problem is resolved. This commit introduces another way to skip
the transaction in question.
The user can specify XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating pg_subscription.subskipxid field, telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of the
subscription in the pg_stat_subscription_errors system view so that the
user is not confused by stale error information. This is done by sending
a message to the stats collector to clear the subscription error.
---
doc/src/sgml/logical-replication.sgml | 49 ++++-
doc/src/sgml/ref/alter_subscription.sgml | 32 ++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 45 +++-
src/backend/postmaster/pgstat.c | 44 +++-
src/backend/replication/logical/worker.c | 201 ++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/include/pgstat.h | 7 +-
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 +
src/test/subscription/t/024_skip_xact.pl | 244 +++++++++++++++++++++
12 files changed, 636 insertions(+), 25 deletions(-)
create mode 100644 src/test/subscription/t/024_skip_xact.pl
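The apply-side behaviour this patch implements can be summarized as: drop every data modification change belonging to the transaction whose remote XID matches subskipxid, then clear subskipxid once that transaction finishes so later transactions are unaffected. A rough Python sketch of that control flow (illustrative only, not the patch's code; class and method names are mine):

```python
INVALID_XID = 0  # stands in for InvalidTransactionId

class ApplyWorker:
    def __init__(self, skip_xid=INVALID_XID):
        self.skip_xid = skip_xid  # subscription's subskipxid, if any
        self.applied = []

    def handle_change(self, remote_xid, change):
        # All data modification changes of the skipped transaction are dropped
        if self.skip_xid != INVALID_XID and remote_xid == self.skip_xid:
            return
        self.applied.append(change)

    def handle_commit(self, remote_xid):
        # Once the skipped transaction is past, clear subskipxid so that a
        # later transaction is not skipped by mistake
        if self.skip_xid != INVALID_XID and remote_xid == self.skip_xid:
            self.skip_xid = INVALID_XID

w = ApplyWorker(skip_xid=716)
w.handle_change(716, ("INSERT", "public.test", 1))  # skipped
w.handle_commit(716)                                # clears skip_xid
w.handle_change(717, ("INSERT", "public.test", 2))  # applied normally
w.handle_commit(717)
```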
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..d558dcfe81 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,63 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict
+ produces an error, it is shown in the
+ <structname>pg_stat_subscription_errors</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The ID of the transaction that contains the change violating the constraint
+ can be found from these outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> to the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Either way, these should be used as a last resort. They skip the whole
+ transaction, including changes that may not violate any constraint, and
+ can easily make the subscriber inconsistent if the user specifies the
+ wrong transaction ID or origin position.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 376fc154b1..a2dac62be7 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -202,8 +202,36 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are: <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ <literal>streaming</literal>, and the following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ stops until the problem is resolved. The resolution can be done
+ either by changing data on the subscriber so that it doesn't
+ conflict with the incoming change or by skipping the whole
+ transaction. This option specifies the ID of the transaction whose
+ changes the logical replication worker should skip. The worker
+ skips all data modification changes within the specified
+ transaction. Therefore, since it skips the whole transaction,
+ including changes that may not violate the constraint, it should
+ only be used as a last resort. This option has no effect on a
+ transaction that is already prepared with
+ <literal>two_phase</literal> enabled on the subscriber. After the
+ logical replication worker successfully skips the transaction, the
+ transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index a58b864f12..ac7fbc1305 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -885,7 +913,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, false);
@@ -934,6 +962,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
@@ -941,7 +976,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts, true);
@@ -967,6 +1002,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ nulls[Anum_pg_subscription_subskipxid - 1] =
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 478b900769..778e409fce 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1743,11 +1743,32 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
msg.m_subid = subid;
msg.m_subrelid = subrelid;
+ msg.m_clear = false;
msg.m_reset = true;
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
}
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error entry of the subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_clear = true;
+ msg.m_reset = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -2037,6 +2058,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_subid = subid;
msg.m_subrelid = subrelid;
msg.m_reset = false;
+ msg.m_clear = false;
msg.m_databaseid = MyDatabaseId;
msg.m_relid = relid;
msg.m_command = command;
@@ -6139,27 +6161,37 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
PgStat_StatSubErrEntry *errent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
errent = pgstat_get_subscription_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
+ Assert(!(msg->m_reset && msg->m_clear));
+
if (errent == NULL)
return;
- /* reset fields and set reset timestamp */
errent->relid = InvalidOid;
errent->command = 0;
errent->xid = InvalidTransactionId;
errent->failure_count = 0;
- errent->last_failure = 0;
- errent->last_errmsg[0] = '\0';
- errent->stat_reset_timestamp = GetCurrentTimestamp();
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index b11291f1ff..2417d040e9 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -269,6 +270,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID if we're skipping all data modification changes
+ * (INSERT/UPDATE/DELETE/TRUNCATE) of the transaction specified by
+ * MySubscription->skipxid. Note that we don't skip receiving the changes;
+ * in particular in streaming cases, we decide whether to skip applying the
+ * changes only when we start to apply them. Once we start skipping, we copy
+ * the XID to skipping_xid and keep skipping until the whole transaction is
+ * done, even if the subscription is invalidated and MySubscription->skipxid
+ * gets changed or reset. When we stop skipping, we reset the skip XID
+ * (subskipxid) in the pg_subscription catalog and associate the origin state
+ * with the transaction that resets it, so that we can restart streaming from
+ * the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/*
* Hash table for storing the streaming xid information along with filesets
* for streaming and subxact files.
@@ -355,6 +371,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -809,6 +828,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -833,7 +857,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * If we're skipping this transaction, stop doing so. Otherwise, commit the
+ * changes that were just applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -861,6 +896,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -919,9 +957,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -935,6 +974,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1066,6 +1109,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1076,6 +1122,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1101,9 +1151,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1125,6 +1176,9 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("duplicate STREAM START message")));
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
/*
* Start a transaction on stream start, this transaction will be committed
* on the stream stop unless it is a tablesync worker in which case it
@@ -1137,9 +1191,6 @@ apply_handle_stream_start(StringInfo s)
/* notify handle methods we're processing a remote transaction */
in_streamed_transaction = true;
- /* extract XID of the top-level transaction */
- stream_xid = logicalrep_read_stream_start(s, &first_segment);
-
if (!TransactionIdIsValid(stream_xid))
ereport(ERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -1221,6 +1272,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1314,6 +1366,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1463,9 +1519,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2351,6 +2421,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3819,3 +3900,103 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID (pg_subscription.subskipxid)
+ * in the catalog. If origin_lsn and origin_committs are valid, we set the
+ * origin state at the transaction commit that resets the skip XID, so that we
+ * can restart streaming from the transaction next to the one we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know that
+ * the subscription is no longer stuck on the conflict.
+ *
+ * The message for clearing the error statistics can be lost, but that's
+ * okay. The user can confirm that logical replication is working in other
+ * ways, for example by checking the pg_stat_subscription view, and can
+ * reset the error statistics of a single subscription with the
+ * pg_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3f55d63425..93bfef0e9c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3677,6 +3677,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index a6914a24e5..6775736b2b 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -536,7 +536,7 @@ typedef struct PgStat_MsgReplSlot
/* ----------
* PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
- * update/reset the error happening during logical
+ * update/reset/clear the error happening during logical
* replication.
* ----------
*/
@@ -554,7 +554,9 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The reset message uses below field */
+ /* The clear and reset messages use the fields below */
+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */
bool m_reset; /* Reset all fields and set reset_stats
* timestamp */
@@ -1111,6 +1113,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index e4c16cab66..e4dc4fb946 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -293,6 +293,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 3b0fbea897..c458b38985 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -228,6 +228,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/024_skip_xact.pl b/src/test/subscription/t/024_skip_xact.pl
new file mode 100644
index 0000000000..affb663803
--- /dev/null
+++ b/src/test/subscription/t/024_skip_xact.pl
@@ -0,0 +1,244 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# Test whether the error reported in the pg_stat_subscription_errors view is expected.
+sub test_subscription_error
+{
+ my ($node, $source, $relname, $expected_error, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Check the error reported in the pg_stat_subscription_errors view and skip
+# the failed transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $source, $subname, $relname, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $source, $relname, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber so that logical replication restarts without
+ # waiting for wal_retrieve_retry_interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+ # Also wait for the error details to be cleared.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT command IS NULL FROM pg_stat_subscription_errors
+WHERE subname = '$subname' AND failure_source = '$source';
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# Don't flood the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to a unique constraint violation.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data into test_tab1, raising an error on the subscriber due to a
+# unique constraint violation on test_tab1. Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber,
+ 'tablesync', 'test_tab2',
+ qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'skip the error reported by the table sync worker');
+
+# Insert enough rows into test_tab_streaming to exceed the 64kB limit, also
+# raising an error on the subscriber while applying the spooled changes, for
+# the same reason. Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error reported by the apply worker while applying streamed changes');
+
+# Insert data into test_tab1 and test_tab_streaming that doesn't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping transactions that are prepared, both directly and after
+# streaming. We insert the same data as in the previous tests but prepare the
+# transactions. Those insertions raise errors on the subscriber; then we skip
+# the transactions in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error on changes of the prepared transaction');
+
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
+# Check that the view shows no entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
From 1057c1b6ea69589dcb6cd44e8ea725df664b11d5 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Fri, 27 Aug 2021 11:49:12 +0530
Subject: [PATCH v12 4/4] Using fileset more effectively in the apply worker
Do not use a separate fileset for each xid; instead, use one fileset
for the worker's entire lifetime. Now the changes/subxact files for
every streaming transaction are created under the same fileset, and
the files are deleted after the transaction is completed. The fileset
itself remains until the worker exits.
---
src/backend/replication/logical/launcher.c | 2 +-
src/backend/replication/logical/worker.c | 249 +++++----------------
src/backend/storage/file/buffile.c | 23 +-
src/backend/utils/sort/logtape.c | 2 +-
src/backend/utils/sort/sharedtuplestore.c | 3 +-
src/include/storage/buffile.h | 5 +-
6 files changed, 75 insertions(+), 209 deletions(-)
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 8b1772db69..644a9c20fe 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -648,7 +648,7 @@ logicalrep_worker_onexit(int code, Datum arg)
logicalrep_worker_detach();
- /* Cleanup filesets used for streaming transactions. */
+ /* Cleanup fileset used for streaming transactions. */
logicalrep_worker_cleanupfileset();
ApplyLauncherWakeup();
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 2417d040e9..34ed8e4b3b 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -237,20 +237,6 @@ static ApplyErrorCallbackArg apply_error_callback_arg =
.ts = 0,
};
-/*
- * Stream xid hash entry. Whenever we see a new xid we create this entry in the
- * xidhash and along with it create the streaming file and store the fileset handle.
- * The subxact file is created iff there is any subxact info under this xid. This
- * entry is used on the subsequent streams for the xid to get the corresponding
- * fileset handles, so storing them in hash makes the search faster.
- */
-typedef struct StreamXidHash
-{
- TransactionId xid; /* xid is the hash key and must be first */
- FileSet *stream_fileset; /* file set for stream data */
- FileSet *subxact_fileset; /* file set for subxact info */
-} StreamXidHash;
-
static MemoryContext ApplyMessageContext = NULL;
MemoryContext ApplyContext = NULL;
@@ -286,10 +272,13 @@ static TransactionId skipping_xid = InvalidTransactionId;
#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
/*
- * Hash table for storing the streaming xid information along with filesets
- * for streaming and subxact files.
+ * The fileset is used by the worker to create the changes and subxact files
+ * for streaming transactions. It is initialized upon the arrival of the
+ * first streaming transaction and deleted when the worker exits. Within it,
+ * separate BufFiles are created for each transaction and deleted after the
+ * transaction is completed.
*/
-static HTAB *xidhash = NULL;
+static FileSet *stream_fileset = NULL;
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -1169,7 +1158,6 @@ static void
apply_handle_stream_start(StringInfo s)
{
bool first_segment;
- HASHCTL hash_ctl;
if (in_streamed_transaction)
ereport(ERROR,
@@ -1199,17 +1187,20 @@ apply_handle_stream_start(StringInfo s)
set_apply_error_context_xact(stream_xid, 0);
/*
- * Initialize the xidhash table if we haven't yet. This will be used for
+ * Initialize the stream_fileset if we haven't yet. This will be used for
* the entire duration of the apply worker so create it in permanent
* context.
*/
- if (xidhash == NULL)
+ if (stream_fileset == NULL)
{
- hash_ctl.keysize = sizeof(TransactionId);
- hash_ctl.entrysize = sizeof(StreamXidHash);
- hash_ctl.hcxt = ApplyContext;
- xidhash = hash_create("StreamXidHash", 1024, &hash_ctl,
- HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ MemoryContext oldctx;
+
+ oldctx = MemoryContextSwitchTo(ApplyContext);
+
+ stream_fileset = palloc(sizeof(FileSet));
+ FileSetInit(stream_fileset);
+
+ MemoryContextSwitchTo(oldctx);
}
/* open the spool file for this transaction */
@@ -1305,7 +1296,6 @@ apply_handle_stream_abort(StringInfo s)
BufFile *fd;
bool found = false;
char path[MAXPGPATH];
- StreamXidHash *ent;
set_apply_error_context_xact(subxid, 0);
@@ -1337,19 +1327,9 @@ apply_handle_stream_abort(StringInfo s)
return;
}
- ent = (StreamXidHash *) hash_search(xidhash,
- (void *) &xid,
- HASH_FIND,
- NULL);
- if (!ent)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("transaction %u not found in stream XID hash table",
- xid)));
-
/* open the changes file */
changes_filename(path, MyLogicalRepWorker->subid, xid);
- fd = BufFileOpenFileSet(ent->stream_fileset, path, O_RDWR);
+ fd = BufFileOpenFileSet(stream_fileset, path, O_RDWR, false);
/* OK, truncate the file at the right offset */
BufFileTruncateFileSet(fd, subxact_data.subxacts[subidx].fileno,
@@ -1383,7 +1363,6 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
int nchanges;
char path[MAXPGPATH];
char *buffer = NULL;
- StreamXidHash *ent;
MemoryContext oldcxt;
BufFile *fd;
@@ -1401,17 +1380,7 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
changes_filename(path, MyLogicalRepWorker->subid, xid);
elog(DEBUG1, "replaying changes from file \"%s\"", path);
- ent = (StreamXidHash *) hash_search(xidhash,
- (void *) &xid,
- HASH_FIND,
- NULL);
- if (!ent)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("transaction %u not found in stream XID hash table",
- xid)));
-
- fd = BufFileOpenFileSet(ent->stream_fileset, path, O_RDONLY);
+ fd = BufFileOpenFileSet(stream_fileset, path, O_RDONLY, false);
buffer = palloc(BLCKSZ);
initStringInfo(&s2);
@@ -2623,27 +2592,14 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
}
/*
- * Cleanup filesets.
+ * Cleanup fileset.
*/
void
-logicalrep_worker_cleanupfileset(void)
+logicalrep_worker_cleanupfileset()
{
- HASH_SEQ_STATUS status;
- StreamXidHash *hentry;
-
- /* Remove all the pending stream and subxact filesets. */
- if (xidhash)
- {
- hash_seq_init(&status, xidhash);
- while ((hentry = (StreamXidHash *) hash_seq_search(&status)) != NULL)
- {
- FileSetDeleteAll(hentry->stream_fileset);
-
- /* Delete the subxact fileset iff it is created. */
- if (hentry->subxact_fileset)
- FileSetDeleteAll(hentry->subxact_fileset);
- }
- }
+ /* If the fileset is created, clean the underlying files. */
+ if (stream_fileset != NULL)
+ FileSetDeleteAll(stream_fileset);
}
/*
@@ -3107,58 +3063,29 @@ subxact_info_write(Oid subid, TransactionId xid)
{
char path[MAXPGPATH];
Size len;
- StreamXidHash *ent;
BufFile *fd;
Assert(TransactionIdIsValid(xid));
- /* Find the xid entry in the xidhash */
- ent = (StreamXidHash *) hash_search(xidhash,
- (void *) &xid,
- HASH_FIND,
- NULL);
- /* By this time we must have created the transaction entry */
- Assert(ent);
+ /* Get the subxact filename. */
+ subxact_filename(path, subid, xid);
/*
- * If there is no subtransaction then nothing to do, but if already have
- * subxact file then delete that.
+ * If there are no subtransactions, there is nothing to be done, but if
+ * subxacts already exist, delete it.
*/
if (subxact_data.nsubxacts == 0)
{
- if (ent->subxact_fileset)
- {
- cleanup_subxact_info();
- FileSetDeleteAll(ent->subxact_fileset);
- pfree(ent->subxact_fileset);
- ent->subxact_fileset = NULL;
- }
+ cleanup_subxact_info();
+ BufFileDeleteFileSet(stream_fileset, path, true);
+
return;
}
- subxact_filename(path, subid, xid);
-
- /*
- * Create the subxact file if it not already created, otherwise open the
- * existing file.
- */
- if (ent->subxact_fileset == NULL)
- {
- MemoryContext oldctx;
-
- /*
- * We need to maintain fileset across multiple stream start/stop
- * calls. So, need to allocate it in a persistent context.
- */
- oldctx = MemoryContextSwitchTo(ApplyContext);
- ent->subxact_fileset = palloc(sizeof(FileSet));
- FileSetInit(ent->subxact_fileset);
- MemoryContextSwitchTo(oldctx);
-
- fd = BufFileCreateFileSet(ent->subxact_fileset, path);
- }
- else
- fd = BufFileOpenFileSet(ent->subxact_fileset, path, O_RDWR);
+ /* Open the subxact file, if it does not exist, create it. */
+ fd = BufFileOpenFileSet(stream_fileset, path, O_RDWR, true);
+ if (fd == NULL)
+ fd = BufFileCreateFileSet(stream_fileset, path);
len = sizeof(SubXactInfo) * subxact_data.nsubxacts;
@@ -3185,34 +3112,20 @@ subxact_info_read(Oid subid, TransactionId xid)
char path[MAXPGPATH];
Size len;
BufFile *fd;
- StreamXidHash *ent;
MemoryContext oldctx;
Assert(!subxact_data.subxacts);
Assert(subxact_data.nsubxacts == 0);
Assert(subxact_data.nsubxacts_max == 0);
- /* Find the stream xid entry in the xidhash */
- ent = (StreamXidHash *) hash_search(xidhash,
- (void *) &xid,
- HASH_FIND,
- NULL);
- if (!ent)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("transaction %u not found in stream XID hash table",
- xid)));
-
/*
- * If subxact_fileset is not valid that mean we don't have any subxact
- * info
+ * Open the subxact file for the input streaming xid, just return if the
+ * file does not exist.
*/
- if (ent->subxact_fileset == NULL)
- return;
-
subxact_filename(path, subid, xid);
-
- fd = BufFileOpenFileSet(ent->subxact_fileset, path, O_RDONLY);
+ fd = BufFileOpenFileSet(stream_fileset, path, O_RDONLY, true);
+ if (fd == NULL)
+ return;
/* read number of subxact items */
if (BufFileRead(fd, &subxact_data.nsubxacts,
@@ -3348,42 +3261,20 @@ changes_filename(char *path, Oid subid, TransactionId xid)
* Cleanup files for a subscription / toplevel transaction.
*
* Remove files with serialized changes and subxact info for a particular
- * toplevel transaction. Each subscription has a separate set of files.
+ * toplevel transaction. Each subscription has a separate file.
*/
static void
stream_cleanup_files(Oid subid, TransactionId xid)
{
char path[MAXPGPATH];
- StreamXidHash *ent;
-
- /* Find the xid entry in the xidhash */
- ent = (StreamXidHash *) hash_search(xidhash,
- (void *) &xid,
- HASH_FIND,
- NULL);
- if (!ent)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("transaction %u not found in stream XID hash table",
- xid)));
- /* Delete the change file and release the stream fileset memory */
+ /* Delete the changes file. */
changes_filename(path, subid, xid);
- FileSetDeleteAll(ent->stream_fileset);
- pfree(ent->stream_fileset);
- ent->stream_fileset = NULL;
-
- /* Delete the subxact file and release the memory, if it exist */
- if (ent->subxact_fileset)
- {
- subxact_filename(path, subid, xid);
- FileSetDeleteAll(ent->subxact_fileset);
- pfree(ent->subxact_fileset);
- ent->subxact_fileset = NULL;
- }
+ BufFileDeleteFileSet(stream_fileset, path, false);
- /* Remove the xid entry from the stream xid hash */
- hash_search(xidhash, (void *) &xid, HASH_REMOVE, NULL);
+ /* Delete the subxact file, if it exist. */
+ subxact_filename(path, subid, xid);
+ BufFileDeleteFileSet(stream_fileset, path, true);
}
/*
@@ -3393,8 +3284,8 @@ stream_cleanup_files(Oid subid, TransactionId xid)
*
* Open a file for streamed changes from a toplevel transaction identified
* by stream_xid (global variable). If it's the first chunk of streamed
- * changes for this transaction, initialize the fileset and create the buffile,
- * otherwise open the previously created file.
+ * changes for this transaction, create the buffile, otherwise open the
+ * previously created file.
*
* This can only be called at the beginning of a "streaming" block, i.e.
* between stream_start/stream_stop messages from the upstream.
@@ -3403,20 +3294,13 @@ static void
stream_open_file(Oid subid, TransactionId xid, bool first_segment)
{
char path[MAXPGPATH];
- bool found;
MemoryContext oldcxt;
- StreamXidHash *ent;
Assert(in_streamed_transaction);
Assert(OidIsValid(subid));
Assert(TransactionIdIsValid(xid));
Assert(stream_fd == NULL);
- /* create or find the xid entry in the xidhash */
- ent = (StreamXidHash *) hash_search(xidhash,
- (void *) &xid,
- HASH_ENTER,
- &found);
changes_filename(path, subid, xid);
elog(DEBUG1, "opening file \"%s\" for streamed changes", path);
@@ -3428,49 +3312,18 @@ stream_open_file(Oid subid, TransactionId xid, bool first_segment)
oldcxt = MemoryContextSwitchTo(LogicalStreamingContext);
/*
- * If this is the first streamed segment, the file must not exist, so make
- * sure we're the ones creating it. Otherwise just open the file for
- * writing, in append mode.
+ * If this is the first streamed segment, create the changes file.
+ * Otherwise, just open the file for writing, in append mode.
*/
if (first_segment)
- {
- MemoryContext savectx;
- FileSet *fileset;
-
- if (found)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect first-segment flag for streamed replication transaction")));
-
- /*
- * We need to maintain fileset across multiple stream start/stop
- * calls. So, need to allocate it in a persistent context.
- */
- savectx = MemoryContextSwitchTo(ApplyContext);
- fileset = palloc(sizeof(FileSet));
-
- FileSetInit(fileset);
- MemoryContextSwitchTo(savectx);
-
- stream_fd = BufFileCreateFileSet(fileset, path);
-
- /* Remember the fileset for the next stream of the same transaction */
- ent->xid = xid;
- ent->stream_fileset = fileset;
- ent->subxact_fileset = NULL;
- }
+ stream_fd = BufFileCreateFileSet(stream_fileset, path);
else
{
- if (!found)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect first-segment flag for streamed replication transaction")));
-
/*
* Open the file and seek to the end of the file because we always
* append the changes file.
*/
- stream_fd = BufFileOpenFileSet(ent->stream_fileset, path, O_RDWR);
+ stream_fd = BufFileOpenFileSet(stream_fileset, path, O_RDWR, false);
BufFileSeek(stream_fd, 0, 0, SEEK_END);
}
diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c
index 5e5409d84d..d96b25df79 100644
--- a/src/backend/storage/file/buffile.c
+++ b/src/backend/storage/file/buffile.c
@@ -278,10 +278,12 @@ BufFileCreateFileSet(FileSet *fileset, const char *name)
* with BufFileCreateFileSet in the same FileSet using the same name.
* The backend that created the file must have called BufFileClose() or
* BufFileExportFileSet() to make sure that it is ready to be opened by other
- * backends and render it read-only.
+ * backends and render it read-only. If missing_ok is true, it will return
+ * NULL if the file does not exist otherwise, it will throw an error.
*/
BufFile *
-BufFileOpenFileSet(FileSet *fileset, const char *name, int mode)
+BufFileOpenFileSet(FileSet *fileset, const char *name, int mode,
+ bool missing_ok)
{
BufFile *file;
char segment_name[MAXPGPATH];
@@ -318,10 +320,18 @@ BufFileOpenFileSet(FileSet *fileset, const char *name, int mode)
* name.
*/
if (nfiles == 0)
+ {
+ /* free the memory */
+ pfree(files);
+
+ if (missing_ok)
+ return NULL;
+
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open temporary file \"%s\" from BufFile \"%s\": %m",
segment_name, name)));
+ }
file = makeBufFileCommon(nfiles);
file->files = files;
@@ -341,10 +351,11 @@ BufFileOpenFileSet(FileSet *fileset, const char *name, int mode)
* the FileSet to be cleaned up.
*
* Only one backend should attempt to delete a given name, and should know
- * that it exists and has been exported or closed.
+ * that it exists and has been exported or closed otherwise missing_ok should
+ * be passed true.
*/
void
-BufFileDeleteFileSet(FileSet *fileset, const char *name)
+BufFileDeleteFileSet(FileSet *fileset, const char *name, bool missing_ok)
{
char segment_name[MAXPGPATH];
int segment = 0;
@@ -358,7 +369,7 @@ BufFileDeleteFileSet(FileSet *fileset, const char *name)
for (;;)
{
FileSetSegmentName(segment_name, name, segment);
- if (!FileSetDelete(fileset, segment_name, true))
+ if (!FileSetDelete(fileset, segment_name, !missing_ok))
break;
found = true;
++segment;
@@ -366,7 +377,7 @@ BufFileDeleteFileSet(FileSet *fileset, const char *name)
CHECK_FOR_INTERRUPTS();
}
- if (!found)
+ if (!found && !missing_ok)
elog(ERROR, "could not delete unknown BufFile \"%s\"", name);
}
diff --git a/src/backend/utils/sort/logtape.c b/src/backend/utils/sort/logtape.c
index f7994d771d..debf12e1b0 100644
--- a/src/backend/utils/sort/logtape.c
+++ b/src/backend/utils/sort/logtape.c
@@ -564,7 +564,7 @@ ltsConcatWorkerTapes(LogicalTapeSet *lts, TapeShare *shared,
lt = &lts->tapes[i];
pg_itoa(i, filename);
- file = BufFileOpenFileSet(&fileset->fs, filename, O_RDONLY);
+ file = BufFileOpenFileSet(&fileset->fs, filename, O_RDONLY, false);
filesize = BufFileSize(file);
/*
diff --git a/src/backend/utils/sort/sharedtuplestore.c b/src/backend/utils/sort/sharedtuplestore.c
index 504ef1c286..033088f9bc 100644
--- a/src/backend/utils/sort/sharedtuplestore.c
+++ b/src/backend/utils/sort/sharedtuplestore.c
@@ -560,7 +560,8 @@ sts_parallel_scan_next(SharedTuplestoreAccessor *accessor, void *meta_data)
sts_filename(name, accessor, accessor->read_participant);
accessor->read_file =
- BufFileOpenFileSet(&accessor->fileset->fs, name, O_RDONLY);
+ BufFileOpenFileSet(&accessor->fileset->fs, name, O_RDONLY,
+ false);
}
/* Seek and load the chunk header. */
diff --git a/src/include/storage/buffile.h b/src/include/storage/buffile.h
index 143eada85f..7ae5ea2dde 100644
--- a/src/include/storage/buffile.h
+++ b/src/include/storage/buffile.h
@@ -49,8 +49,9 @@ extern long BufFileAppend(BufFile *target, BufFile *source);
extern BufFile *BufFileCreateFileSet(FileSet *fileset, const char *name);
extern void BufFileExportFileSet(BufFile *file);
extern BufFile *BufFileOpenFileSet(FileSet *fileset, const char *name,
- int mode);
-extern void BufFileDeleteFileSet(FileSet *fileset, const char *name);
+ int mode, bool missing_ok);
+extern void BufFileDeleteFileSet(FileSet *fileset, const char *name,
+ bool missing_ok);
extern void BufFileTruncateFileSet(BufFile *file, int fileno, off_t offset);
#endif /* BUFFILE_H */
--
2.24.3 (Apple Git-128)
[Attachment: v12-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch (application/octet-stream)]
From 1d0347681e4c9ead7dab0d522956166cf36f30ed Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v12 2/4] Add RESET command to ALTER SUBSCRIPTION command.
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be set are streaming, binary,
synchronous_commit.
RESET command is reuiqred by follow-up commit introducing to a new
parameter skip_xid to reset.
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 78 +++++++++++++++++-----
src/backend/parser/gram.y | 11 ++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 +++-
src/test/regress/sql/subscription.sql | 13 ++++
6 files changed, 109 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 835be0d2a4..376fc154b1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -189,16 +190,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..a58b864f12 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -923,10 +938,43 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+
+ parse_subscription_options(pstate, stmt->options,
+ supported_opts, &opts, true);
+
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_BINARY))
+ {
+ values[Anum_pg_subscription_subbinary - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_subbinary - 1] = true;
+ }
+
+ if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
+ {
+ values[Anum_pg_subscription_substream - 1] =
+ BoolGetDatum(false);
+ replaces[Anum_pg_subscription_substream - 1] = true;
+ }
+
+ update_tuple = true;
+ break;
+ }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +1009,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1008,7 +1056,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = SUBOPT_REFRESH | SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1056,7 +1104,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39a2849eba..bcf85e8980 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,7 +9707,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 7af13dee43..3f55d63425 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,7 +3659,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3671,7 +3672,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 15a1ac6398..e4c16cab66 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -281,11 +281,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7faa935a2a..3b0fbea897 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -215,6 +215,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
I have some initial feedback on the v12-0001 patch.
Most of these are suggested improvements to wording, and some typo fixes.
(0) Patch comment
Suggestion to improve the patch comment:
BEFORE:
Add pg_stat_subscription_errors statistics view.
This commits adds new system view pg_stat_logical_replication_error,
showing errors happening during applying logical replication changes
as well as during performing initial table synchronization.
The subscription error entries are removed by autovacuum workers when
the table synchronization competed in table sync worker cases and when
dropping the subscription in apply worker cases.
It also adds SQL function pg_stat_reset_subscription_error() to
reset the single subscription error.
AFTER:
Add a subscription errors statistics view "pg_stat_subscription_errors".
This commit adds a new system view pg_stat_logical_replication_errors,
that records information about any errors which occur during application
of logical replication changes as well as during performing initial table
synchronization.
The subscription error entries are removed by autovacuum workers after
table synchronization completes in table sync worker cases and after
dropping the subscription in apply worker cases.
It also adds an SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
doc/src/sgml/monitoring.sgml:
(1)
BEFORE:
+ <entry>One row per error that happened on subscription, showing
information about
+ the subscription errors.
AFTER:
+ <entry>One row per error that occurred on subscription,
providing information about
+ each subscription error.
(2)
BEFORE:
+ The <structname>pg_stat_subscription_errors</structname> view will
contain one
AFTER:
+ The <structname>pg_stat_subscription_errors</structname> view contains one
(3)
BEFORE:
+ Name of the database in which the subscription is created.
AFTER:
+ Name of the database in which the subscription was created.
(4)
BEFORE:
+ OID of the relation that the worker is processing when the
+ error happened.
AFTER:
+ OID of the relation that the worker was processing when the
+ error occurred.
(5)
BEFORE:
+ Name of command being applied when the error happened. This
+ field is always NULL if the error is reported by
+ <literal>tablesync</literal> worker.
AFTER:
+ Name of command being applied when the error occurred. This
+ field is always NULL if the error is reported by a
+ <literal>tablesync</literal> worker.
(6)
BEFORE:
+ Transaction ID of publisher node being applied when the error
+ happened. This field is always NULL if the error is reported
+ by <literal>tablesync</literal> worker.
AFTER:
+ Transaction ID of the publisher node being applied when the error
+ happened. This field is always NULL if the error is reported
+ by a <literal>tablesync</literal> worker.
(7)
BEFORE:
+ Type of worker reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
AFTER:
+ Type of worker reporting the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
(8)
BEFORE:
+ Number of times error happened on the worker.
AFTER:
+ Number of times the error occurred in the worker.
[or "Number of times the worker reported the error" ?]
(9)
BEFORE:
+ Time at which the last error happened.
AFTER:
+ Time at which the last error occurred.
(10)
BEFORE:
+ Error message which is reported last failure time.
AFTER:
+ Error message which was reported at the last failure time.
Maybe this should just say "Last reported error message" ?
(11)
You shouldn't call hash_get_num_entries() on a NULL pointer.
Suggest swapping lines, as shown below:
BEFORE:
+ int32 nerrors = hash_get_num_entries(subent->suberrors);
+
+ /* Skip this subscription if not have any errors */
+ if (subent->suberrors == NULL)
+ continue;
AFTER:
+ int32 nerrors;
+
+ /* Skip this subscription if not have any errors */
+ if (subent->suberrors == NULL)
+ continue;
+ nerrors = hash_get_num_entries(subent->suberrors);
(12)
Typo: legnth -> length
+ * contains the fixed-legnth error message string which is
src/backend/postmaster/pgstat.c
(13)
"Subscription stat entries" hashtable is created in two different
places, one with HASH_CONTEXT and the other without. Is this
intentional?
Shouldn't there be a single function for creating this?
(14)
+ * PgStat_MsgSubscriptionPurge Sent by the autovacuum purge the subscriptions.
Seems to be missing a word, is it meant to say "Sent by the autovacuum
to purge the subscriptions." ?
(15)
+ * PgStat_MsgSubscriptionErrPurge Sent by the autovacuum purge the subscription
+ * errors.
Seems to be missing a word, is it meant to say "Sent by the autovacuum
to purge the subscription errors." ?
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
I have a few comments on the v12-0002 patch:
(1) Patch comment
Has a typo and could be expressed a bit better.
Suggestion:
BEFORE:
RESET command is reuiqred by follow-up commit introducing to a new
parameter skip_xid to reset.
AFTER:
The RESET parameter for ALTER SUBSCRIPTION is required by the
follow-up commit that introduces a new resettable subscription
parameter "skip_xid".
doc/src/sgml/ref/alter_subscription.sgml
(2)
I don't think "RESET" is sufficiently described in
alter_subscription.sgml. Just putting it under "SET" and changing
"altered" to "set" doesn't explain what resetting does. It should say
something about setting the parameter back to its original (default)
value.
(3)
case ALTER_SUBSCRIPTION_RESET_OPTIONS
Some comments here would be helpful e.g. Reset the specified
parameters back to their default values.
Regards,
Greg Nancarrow
Fujitsu Australia
From Mon, Aug 30, 2021 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this patch. It's
borrowed from another thread[1] to fix the assertion failure for newly added
tests. Please review them.
Hi,
I reviewed the v12-0001 patch, here are some comments:
1)
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,7 +1441,6 @@ getinternalerrposition(void)
return edata->internalpos;
}
-
This seems to be an accidental change in elog.c.
2)
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
The document doesn't mention the column "stats_reset".
3)
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
From the comments for subrelid, I think one subscription can only have one
apply worker error entry, right? If so, I was thinking we could move the apply
worker error entry into PgStat_StatSubEntry. With that approach, we wouldn't
need to build an inner hash table when there are no table sync error entries.
4)
Is it possible to add some test cases to test the deletion of subscription error entries?
Best regards,
Hou zj
From Mon, Aug 30, 2021 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
Hi,
I reviewed the 0002 patch and have a suggestion for it.
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
Currently, the patch sets the default value outside of
parse_subscription_options(), but I think it might be more standard to set the
value in parse_subscription_options(), like:
if (!is_reset)
{
...
+ }
+ else
+ opts->synchronous_commit = "off";
And then, we can set the value like:
values[Anum_pg_subscription_subsynccommit - 1] =
CStringGetTextDatum(opts.synchronous_commit);
Besides, instead of adding a switch case like the following:
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
We can add a bool flag (isReset) to AlterSubscriptionStmt and check the flag
when invoking parse_subscription_options(). With this approach, the code can
be shorter.
Attached is a diff file, based on v12-0002, which changes the code as
suggested above.
Best regards,
Hou zj
Attachments:
0001-diff-for-0002_patch
From ece40ad7ef853488d323f59a09d0614ff242fa09 Mon Sep 17 00:00:00 2001
From: "houzj.fnst" <houzj.fnst@cn.fujitsu.com>
Date: Thu, 2 Sep 2021 19:16:03 +0800
Subject: [PATCH] diff for 0002
---
src/backend/commands/subscriptioncmds.c | 49 ++++++++-------------------------
src/backend/parser/gram.y | 6 ++--
src/include/nodes/parsenodes.h | 6 ++--
3 files changed, 19 insertions(+), 42 deletions(-)
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index a58b864..b3ae2c5 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -210,6 +210,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
false, 0, false);
}
+ else
+ opts->synchronous_commit = "off";
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -881,14 +883,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_SET_OPTIONS:
+ case ALTER_SUBSCRIPTION_OPTIONS:
{
- supported_opts = (SUBOPT_SLOT_NAME |
- SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ if (stmt->isReset)
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+ else
+ supported_opts = (SUBOPT_SLOT_NAME |
+ SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts, false);
+ supported_opts, &opts,
+ stmt->isReset);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -938,38 +945,6 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
- case ALTER_SUBSCRIPTION_RESET_OPTIONS:
- {
- supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
-
- parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts, true);
-
- if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
- {
- values[Anum_pg_subscription_subsynccommit - 1] =
- CStringGetTextDatum("off");
- replaces[Anum_pg_subscription_subsynccommit - 1] = true;
- }
-
- if (IsSet(opts.specified_opts, SUBOPT_BINARY))
- {
- values[Anum_pg_subscription_subbinary - 1] =
- BoolGetDatum(false);
- replaces[Anum_pg_subscription_subbinary - 1] = true;
- }
-
- if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
- {
- values[Anum_pg_subscription_substream - 1] =
- BoolGetDatum(false);
- replaces[Anum_pg_subscription_substream - 1] = true;
- }
-
- update_tuple = true;
- break;
- }
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index bcf85e8..9cbc065 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9707,18 +9707,20 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_OPTIONS;
n->subname = $3;
n->options = $5;
+ n->isReset = false;
$$ = (Node *)n;
}
| ALTER SUBSCRIPTION name RESET definition
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_OPTIONS;
n->subname = $3;
n->options = $5;
+ n->isReset = true;
$$ = (Node *)n;
}
| ALTER SUBSCRIPTION name CONNECTION Sconst
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3f55d63..346887f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3659,8 +3659,7 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_SET_OPTIONS,
- ALTER_SUBSCRIPTION_RESET_OPTIONS,
+ ALTER_SUBSCRIPTION_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3672,11 +3671,12 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ bool isReset; /* true if RESET option */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
--
2.7.2.windows.1
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
Some initial comments for the v12-0003 patch:
(1) Patch comment
"This commit introduces another way to skip the transaction in question."
I think it should further explain: "This commit introduces another way
to skip the transaction in question, other than manually updating the
subscriber's database or using pg_replication_origin_advance()."
doc/src/sgml/logical-replication.sgml
(2)
Suggested minor update:
BEFORE:
+ transaction that conflicts with the existing data. When a conflict produce
+ an error, it is shown in
<structname>pg_stat_subscription_errors</structname>
+ view as follows:
AFTER:
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is recorded in the
<structname>pg_stat_subscription_errors</structname>
+ view as follows:
(3)
+ found from those outputs (transaction ID 740 in the above case).
The transaction
Shouldn't it be transaction ID 716?
(4)
+ can be skipped by setting <replaceable>skip_xid</replaceable> to
the subscription
Is it better to say here ... "on the subscription" ?
(5)
Just skipping a transaction could make a subscriber inconsistent, right?
Would it be better as follows?
BEFORE:
+ In either way, those should be used as a last resort. They skip the whole
+ transaction including changes that may not violate any constraint and easily
+ make subscriber inconsistent if a user specifies the wrong transaction ID or
+ the position of origin.
AFTER:
+ Either way, those transaction skipping methods should be used as a
last resort.
+ They skip the whole transaction, including changes that may not violate any
+ constraint. They may easily make the subscriber inconsistent,
especially if a
+ user specifies the wrong transaction ID or the position of origin.
(6)
The grammar is not great in the following description, so here's a
suggested improvement:
BEFORE:
+ incoming change or by skipping the whole transaction. This option
+ specifies transaction ID that logical replication worker skips to
+ apply. The logical replication worker skips all data modification
AFTER:
+ incoming changes or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose application is to
be skipped
+ by the logical replication worker. The logical replication worker
+ skips all data modification
src/backend/postmaster/pgstat.c
(7)
BEFORE:
+ * Tell the collector about clear the error of subscription.
AFTER:
+ * Tell the collector to clear the subscription error.
src/backend/replication/logical/worker.c
(8)
+ * subscription is invalidated and* MySubscription->skipxid gets
changed or reset.
There is a "*" after "and".
(9)
Do these lines really need to be moved up?
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
src/backend/postmaster/pgstat.c
(10)
+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */
I think it should say: clear all fields except for last_failure,
last_errmsg and stat_reset_timestamp.
Regards,
Greg Nancarrow
Fujitsu Australia
On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches.
Thanks for these patches, Sawada-san!
The first patch in your series, v12-0001, seems useful to me even before committing any of the rest. I would like to integrate the new pg_stat_subscription_errors view it creates into regression tests for other logical replication features under development.
In particular, it can be hard to write TAP tests that need to wait for subscriptions to catch up or fail. With your view committed, a new PostgresNode function to wait for catchup or for failure can be added, and then developers of different projects can all use that. I am attaching a version of such a function, plus some tests of your patch (since it does not appear to have any). Would you mind reviewing these and giving comments or including them in your next patch version?
Attachments:
0001-Adding-tests-of-subscription-errors.patch
From 7f241b871310ac37898bd04a43868fd8d4a8e3a4 Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Thu, 2 Sep 2021 11:55:29 -0700
Subject: [PATCH 2/2] Adding tests of subscription errors
---
src/test/perl/PostgresNode.pm | 64 +++++++++++
src/test/regress/expected/subscription.out | 6 +
src/test/regress/expected/sysviews.out | 8 ++
src/test/regress/sql/subscription.sql | 4 +
src/test/regress/sql/sysviews.sql | 5 +
src/test/subscription/t/025_errors.pl | 124 +++++++++++++++++++++
6 files changed, 211 insertions(+)
create mode 100644 src/test/subscription/t/025_errors.pl
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index c59da758c7..d4d8cd8b68 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -2499,6 +2499,70 @@ sub wait_for_slot_catchup
return;
}
+
+=item $node->wait_for_subscriptions($dbname, @subscriptions)
+
+Wait for the named subscriptions to catch up or error.
+
+=cut
+
+sub wait_for_subscriptions
+{
+ my ($self, $dbname, @subscriptions) = @_;
+
+ # Unique-ify the subscriptions passed by the caller
+ my %unique = map { $_ => 1 } @subscriptions;
+ my @unique = sort keys %unique;
+ my $unique_count = scalar(@unique);
+
+ # It makes sense to quietly return immediately for a list of zero
+ # subscriptions, but that is more likely user error than intentional, so we
+ # instead tell the caller about it noisily.
+ croak "subscriptions must be specified" unless $unique_count;
+
+ # Construct a SQL list from the unique subscription names
+ my @escaped = map { s/'/''/g; s/\\/\\\\/g; $_ } @unique;
+ my $quotedlist = join(', ', map { "'$_'" } @escaped);
+
+ # Sanity check that the subscriptions exist. We don't want to
+ # poll until timeout on a non-existent misspelled subscription name.
+ my $unmatched = $self->safe_psql($dbname, qq(
+ SELECT string_agg(subname, ', ') FROM (
+ SELECT arg.subname
+ FROM (SELECT subname FROM unnest(ARRAY[$quotedlist]::text[]) AS subname) AS arg
+ LEFT JOIN pg_catalog.pg_subscription pg
+ ON arg.subname = pg.subname
+ WHERE pg.subname IS NULL
+ ORDER BY arg.subname
+ ) AS ss
+ ));
+ croak "no such subscription: $unmatched"
+ if length $unmatched;
+
+ # Ok, the subscriptions exist, so we can poll on them synchronizing or
+ # failing. There is a race condition between when we checked above and
+ # this query, but we were only trying to detect typos in the tests, not
+ # concurrent subscription drops.
+ my $polling_sql = qq(
+ SELECT COUNT(1) = $unique_count FROM
+ (SELECT s.oid AS subid
+ FROM pg_catalog.pg_subscription s
+ LEFT JOIN pg_catalog.pg_subscription_rel sr
+ ON sr.srsubid = s.oid
+ WHERE (sr IS NULL OR sr.srsubstate IN ('s', 'r'))
+ AND s.subname IN ($quotedlist)
+ UNION
+ SELECT e.subid
+ FROM pg_catalog.pg_stat_subscription_errors e
+ WHERE e.subname IN ($quotedlist)
+ ) AS synced_or_errored
+ );
+ $self->poll_query_until($dbname, $polling_sql)
+ or croak "timed out waiting for subscriptions";
+ print "done\n";
+ return;
+}
+
=pod
=item $node->query_hash($dbname, $query, @columns)
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 15a1ac6398..d33174849c 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -289,6 +289,12 @@ ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- no errors should be reported
+SELECT * FROM pg_stat_subscription_errors;
+ datname | subid | subname | relid | command | xid | failure_source | failure_count | last_failure | last_failure_message | stats_reset
+---------+-------+---------+-------+---------+-----+----------------+---------------+--------------+----------------------+-------------
+(0 rows)
+
DROP SUBSCRIPTION regress_testsub;
-- two_phase and streaming are compatible.
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 6e54f3e15e..7cfe54224d 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -150,3 +150,11 @@ select count(distinct utc_offset) >= 24 as ok from pg_timezone_abbrevs;
t
(1 row)
+-- Test that the subscription errors view exists, and has the right columns.
+-- If we expected any rows to exist, we would need to filter out unstable
+-- columns. But since there should be no errors, we just select them all.
+select * from pg_stat_subscription_errors;
+ datname | subid | subname | relid | command | xid | failure_source | failure_count | last_failure | last_failure_message | stats_reset
+---------+-------+---------+-------+---------+-----+----------------+---------------+--------------+----------------------+-------------
+(0 rows)
+
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7faa935a2a..b2caf86b22 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -218,6 +218,10 @@ ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- no errors should be reported
+SELECT * FROM pg_stat_subscription_errors;
+
DROP SUBSCRIPTION regress_testsub;
-- two_phase and streaming are compatible.
diff --git a/src/test/regress/sql/sysviews.sql b/src/test/regress/sql/sysviews.sql
index dc8c9a3ac2..3991a246f5 100644
--- a/src/test/regress/sql/sysviews.sql
+++ b/src/test/regress/sql/sysviews.sql
@@ -60,3 +60,8 @@ set timezone_abbreviations = 'Australia';
select count(distinct utc_offset) >= 24 as ok from pg_timezone_abbrevs;
set timezone_abbreviations = 'India';
select count(distinct utc_offset) >= 24 as ok from pg_timezone_abbrevs;
+
+-- Test that the subscription errors view exists, and has the right columns.
+-- If we expected any rows to exist, we would need to filter out unstable
+-- columns. But since there should be no errors, we just select them all.
+select * from pg_stat_subscription_errors;
diff --git a/src/test/subscription/t/025_errors.pl b/src/test/subscription/t/025_errors.pl
new file mode 100644
index 0000000000..c5bd45ab15
--- /dev/null
+++ b/src/test/subscription/t/025_errors.pl
@@ -0,0 +1,124 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# This test checks behaviour of pg_stat_subscription_errors view when
+# tablesync and apply workers fail
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 3;
+
+# Create a chain of nodes for logical replication to propagate as:
+#
+# publisher => middleman => subscriber
+#
+my ($publisher, $middleman, $subscriber);
+
+$publisher = PostgresNode->new('publisher');
+$publisher->init(allows_streaming => 'logical');
+$publisher->start;
+
+$middleman = PostgresNode->new('middleman');
+$middleman->init(allows_streaming => 'logical');
+$middleman->start;
+
+$subscriber = PostgresNode->new('subscriber');
+$subscriber->init;
+$subscriber->start;
+
+my ($node, $schema, $results);
+my @nodes = ($publisher, $middleman, $subscriber);
+
+# Create boilerplate text of the DDL needed to construct schemas and objects on
+# each node
+#
+my @schemas = qw(
+ good conflicts_on_middleman conflicts_on_subscriber);
+my @ddl = map { qq(
+CREATE SCHEMA $_;
+CREATE TABLE $_.tbl (i INTEGER);
+ALTER TABLE $_.tbl REPLICA IDENTITY FULL;
+CREATE INDEX ${_}_idx ON $_.tbl(i);
+) } @schemas;
+
+for my $node (@nodes)
+{
+ $node->safe_psql('postgres', $_) for (@ddl);
+}
+
+# Create non-unique data in all schemas on publisher
+#
+my @dml = map { qq(INSERT INTO $_.tbl (i) VALUES (1), (1), (1)) } @schemas;
+$publisher->safe_psql('postgres', $_) for (@dml);
+
+# Create additional DDL on the middleman and subscriber that will cause
+# replication failures during the initial tablesync.
+#
+$middleman->safe_psql('postgres', qq(
+CREATE UNIQUE INDEX unique_idx
+ ON conflicts_on_middleman.tbl(i)));
+$subscriber->safe_psql('postgres', qq(
+ALTER TABLE conflicts_on_subscriber.tbl
+ ADD CONSTRAINT must_be_three CHECK (i = 3)));
+
+# Insert data to all schemas on the middleman which do not violate the
+# middleman's uniqueness requirements.
+#
+for $node ($publisher, $middleman)
+{
+ $node->safe_psql('postgres', qq(INSERT INTO $_.tbl VALUES (1), (2), (3), (4)))
+ for (@schemas);
+}
+
+# Create publications named after the schemas they publish
+for $node ($publisher, $middleman)
+{
+ $node->safe_psql('postgres', qq(CREATE PUBLICATION $_ FOR TABLE $_.tbl))
+ for (@schemas);
+}
+
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+my $middleman_connstr = $middleman->connstr . ' dbname=postgres';
+
+# Create subscriptions named after the schemas they subscribe
+#
+for $schema (@schemas)
+{
+ $middleman->safe_psql('postgres', qq(
+CREATE SUBSCRIPTION $schema
+ CONNECTION '$publisher_connstr'
+ PUBLICATION $schema));
+}
+
+for $schema (@schemas)
+{
+ $subscriber->safe_psql('postgres', qq(
+CREATE SUBSCRIPTION $schema
+ CONNECTION '$middleman_connstr'
+ PUBLICATION $schema));
+}
+
+# Wait for the subscriptions to finish synchronizing or to error
+#
+$middleman->wait_for_subscriptions('postgres', @schemas);
+
+$subscriber->wait_for_subscriptions('postgres', @schemas);
+
+$results = $publisher->safe_psql('postgres',
+ 'select count(*) from pg_catalog.pg_stat_subscription_errors');
+is ($results, 0, "publisher has no subscription errors");
+
+$results = $middleman->safe_psql('postgres',
+ 'select datname, subname, failure_source, last_failure_message from pg_catalog.pg_stat_subscription_errors');
+is ($results, 'postgres|conflicts_on_middleman|tablesync|duplicate key value violates unique constraint "unique_idx"',
+ 'expected subscription failure on middleman');
+
+$results = $subscriber->safe_psql('postgres',
+ 'select datname, subname, failure_source, last_failure_message from pg_catalog.pg_stat_subscription_errors');
+is ($results, 'postgres|conflicts_on_subscriber|tablesync|new row for relation "tbl" violates check constraint "must_be_three"',
+ 'expected subscription failure on subscriber');
+
+$subscriber->stop();
+$middleman->stop();
+$publisher->stop();
--
2.21.1 (Apple Git-122.3)
On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches.
Here are some review comments:
For the v12-0002 patch:
The documentation changes for ALTER SUBSCRIPTION .. RESET look strange to me. You grouped SET and RESET together, much like sql-altertable.html has them grouped, but I don't think it flows naturally here, as the two commands do not support the same set of parameters. It might look better if you documented these separately. It might also be good to order the parameters the same, so that the differences can more quickly be seen.
For the v12-0003 patch:
I believe this feature is needed, but it also seems like a very powerful foot-gun. Can we do anything to make it less likely that users will hurt themselves with this tool?
I am thinking back to support calls I have attended. When a production system is down, there is often some hesitancy to perform ad-hoc operations on the database, but once the decision has been made to do so, people try to get the whole process done as quickly as possible. If multiple transactions on the publisher fail on the subscriber, they will do so in series, not in parallel. The process of clearing these errors will amount to copying the xid of each failed transaction to the ALTER SUBSCRIPTION ... SET (skip_xid = xxx) command and running it, then the next, then the next, .... Perhaps the first couple times through the process, the customer will look to see that the failure is of the same type and on the same table, but after a short time they will likely just script something to clear the rest as quickly as possible. In the heat of the moment, they may not include a check of the failure message, but merely a grep of the failing xid.
If the user could instead clear all failed transactions of the same type, that might make it less likely that they unthinkingly also skip subsequent errors of some different type. Perhaps something like ALTER SUBSCRIPTION ... SET (skip_failures = 'duplicate key value violates unique constraint "test_pkey"')? This is arguably a different feature request, and not something your patch is required to address, but I wonder how much we should limit people shooting themselves in the foot? If we built something like this using your skip_xid feature, rather than instead of your skip_xid feature, would your feature need to be modified?
The docs could use some rewording, too:
+ If incoming data violates any constraints the logical replication
+ will stop until it is resolved.
In my experience, logical replication doesn't stop, but instead goes into an infinite loop of retries.
+ The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ incoming change or by skipping the whole transaction.
I'm having trouble thinking of an example conflict where skipping a transaction would be better than writing a BEFORE INSERT trigger on the conflicting table which suppresses or redirects conflicting rows somewhere else. Particularly for larger transactions containing multiple statements, suppressing the conflicting rows using a trigger would be less messy than skipping the transaction. I think your patch adds a useful tool to the toolkit, but maybe we should mention more alternatives in the docs? Something like, "changing the data on the subscriber so that it doesn't conflict with incoming changes, or dropping the conflicting constraint or unique index, or writing a trigger on the subscriber to suppress or redirect conflicting incoming changes, or as a last resort, by skipping the whole transaction"?
Perhaps I'm reading your phrase "changing the data on the subscriber" too narrowly. To me, that means running DML (either a DELETE or an UPDATE) on the existing data in the table where the conflict arises. These other options are DDL and do not easily come to mind when I read that phrase.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
BTW, these patches need rebasing (broken by recent commits, patches
0001, 0003 and 0004 no longer apply, and it's failing in the cfbot).
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Sep 3, 2021 at 2:15 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches.
For the v12-0003 patch:
I believe this feature is needed, but it also seems like a very powerful foot-gun. Can we do anything to make it less likely that users will hurt themselves with this tool?
This won't do any more harm than currently, users can do via
pg_replication_slot_advance and the same is documented as well, see
[1]. This will be allowed to only superusers. Its effect will be
documented with a precautionary note to use it only when the other
available ways can't be used. Any better ideas?
I am thinking back to support calls I have attended. When a production system is down, there is often some hesitancy to perform ad-hoc operations on the database, but once the decision has been made to do so, people try to get the whole process done as quickly as possible. If multiple transactions on the publisher fail on the subscriber, they will do so in series, not in parallel.
The subscriber will know only one transaction failure at a time, till
that is resolved, the apply won't move ahead and it won't know even if
there are other transactions that are going to fail in the future.
If the user could instead clear all failed transactions of the same type, that might make it less likely that they unthinkingly also skip subsequent errors of some different type. Perhaps something like ALTER SUBSCRIPTION ... SET (skip_failures = 'duplicate key value violates unique constraint "test_pkey"')?
I think if we want we can allow to skip particular error via
skip_error_code instead of via error message but not sure if it would
be better to skip a particular operation of the transaction rather
than the entire transaction. Normally from the atomicity purpose the
transaction can be either committed or rolled-back but not partially
done so I think it would be preferable to skip the entire transaction
rather than skipping it partially.
This is arguably a different feature request, and not something your patch is required to address, but I wonder how much we should limit people shooting themselves in the foot? If we built something like this using your skip_xid feature, rather than instead of your skip_xid feature, would your feature need to be modified?
Sawada-San can answer better but I don't see any problem building any
such feature on top of what is currently proposed.
I'm having trouble thinking of an example conflict where skipping a transaction would be better than writing a BEFORE INSERT trigger on the conflicting table which suppresses or redirects conflicting rows somewhere else. Particularly for larger transactions containing multiple statements, suppressing the conflicting rows using a trigger would be less messy than skipping the transaction. I think your patch adds a useful tool to the toolkit, but maybe we should mention more alternatives in the docs? Something like, "changing the data on the subscriber so that it doesn't conflict with incoming changes, or dropping the conflicting constraint or unique index, or writing a trigger on the subscriber to suppress or redirect conflicting incoming changes, or as a last resort, by skipping the whole transaction"?
+1 for extending the docs as per this suggestion.
--
With Regards,
Amit Kapila.
On Sat, Sep 4, 2021 at 8:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Sep 3, 2021 at 2:15 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches.
For the v12-0003 patch:
I believe this feature is needed, but it also seems like a very powerful foot-gun. Can we do anything to make it less likely that users will hurt themselves with this tool?
This won't do any more harm than currently, users can do via
pg_replication_slot_advance and the same is documented as well, see
[1].
Sorry, forgot to give the link.
[1]: https://www.postgresql.org/docs/devel/logical-replication-conflicts.html
--
With Regards,
Amit Kapila.
On Thu, Sep 2, 2021 at 12:06 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.I have some initial feedback on the v12-0001 patch.
Most of these are suggested improvements to wording, and some typo fixes.
Thank you for the comments!
(0) Patch comment
Suggestion to improve the patch comment:
BEFORE:
Add pg_stat_subscription_errors statistics view.
This commits adds new system view pg_stat_logical_replication_error,
Oops, I realized that it should be pg_stat_subscription_errors.
showing errors happening during applying logical replication changes
as well as during performing initial table synchronization.
The subscription error entries are removed by autovacuum workers when
the table synchronization competed in table sync worker cases and when
dropping the subscription in apply worker cases.
It also adds SQL function pg_stat_reset_subscription_error() to
reset the single subscription error.
AFTER:
Add a subscription errors statistics view "pg_stat_subscription_errors".
This commit adds a new system view pg_stat_logical_replication_errors,
that records information about any errors which occur during application
of logical replication changes as well as during performing initial table
synchronization.
I think that views don't have any data so "show information" seems
appropriate to me here. Thoughts?
The subscription error entries are removed by autovacuum workers after
table synchronization completes in table sync worker cases and after
dropping the subscription in apply worker cases.
It also adds an SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
doc/src/sgml/monitoring.sgml:
(1)
BEFORE:
+ <entry>One row per error that happened on subscription, showing information about
+ the subscription errors.
AFTER:
+ <entry>One row per error that occurred on subscription, providing information about
+ each subscription error.
Fixed.
(2)
BEFORE:
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
AFTER:
+ The <structname>pg_stat_subscription_errors</structname> view contains one
I think that descriptions of other statistics views also say "XXX view
will contain ...".
(3)
BEFORE:
+ Name of the database in which the subscription is created.
AFTER:
+ Name of the database in which the subscription was created.
Fixed.
(4)
BEFORE:
+ OID of the relation that the worker is processing when the
+ error happened.
AFTER:
+ OID of the relation that the worker was processing when the
+ error occurred.
Fixed.
(5)
BEFORE:
+ Name of command being applied when the error happened. This
+ field is always NULL if the error is reported by
+ <literal>tablesync</literal> worker.
AFTER:
+ Name of command being applied when the error occurred. This
+ field is always NULL if the error is reported by a
+ <literal>tablesync</literal> worker.
Fixed.
(6)
BEFORE:
+ Transaction ID of publisher node being applied when the error
+ happened. This field is always NULL if the error is reported
+ by <literal>tablesync</literal> worker.
AFTER:
+ Transaction ID of the publisher node being applied when the error
+ happened. This field is always NULL if the error is reported
+ by a <literal>tablesync</literal> worker.
Fixed.
(7)
BEFORE:
+ Type of worker reported the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
AFTER:
+ Type of worker reporting the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
Fixed.
(8)
BEFORE:
+ Number of times error happened on the worker.
AFTER:
+ Number of times the error occurred in the worker.
[or "Number of times the worker reported the error" ?]
I prefer "Number of times the error occurred in the worker."
(9)
BEFORE:
+ Time at which the last error happened.
AFTER:
+ Time at which the last error occurred.
Fixed.
(10)
BEFORE:
+ Error message which is reported last failure time.
AFTER:
+ Error message which was reported at the last failure time.
Maybe this should just say "Last reported error message"?
Fixed.
(11)
You shouldn't call hash_get_num_entries() on a NULL pointer.
Suggest swapping lines, as shown below:
BEFORE:
+ int32 nerrors = hash_get_num_entries(subent->suberrors);
+
+ /* Skip this subscription if not have any errors */
+ if (subent->suberrors == NULL)
+ continue;
AFTER:
+ int32 nerrors;
+
+ /* Skip this subscription if not have any errors */
+ if (subent->suberrors == NULL)
+ continue;
+ nerrors = hash_get_num_entries(subent->suberrors);
Right. Fixed.
(12)
Typo: legnth -> length
+ * contains the fixed-legnth error message string which is
Fixed.
src/backend/postmaster/pgstat.c
(13)
"Subscription stat entries" hashtable is created in two different
places, one with HASH_CONTEXT and the other without. Is this
intentional?
Shouldn't there be a single function for creating this?
Yes, it's intentional. It's consistent with hash tables for other statistics.
(14)
+ * PgStat_MsgSubscriptionPurge Sent by the autovacuum purge the subscriptions.
Seems to be missing a word, is it meant to say "Sent by the autovacuum
to purge the subscriptions." ?
Yes, fixed.
(15)
+ * PgStat_MsgSubscriptionErrPurge Sent by the autovacuum purge the subscription
+ * errors.
Seems to be missing a word, is it meant to say "Sent by the autovacuum
to purge the subscription errors." ?
Thanks, fixed.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Sep 2, 2021 at 2:55 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
I have a few comments on the v12-0002 patch:
Thank you for the comments!
(1) Patch comment
Has a typo and could be expressed a bit better.
Suggestion:
BEFORE:
RESET command is reuiqred by follow-up commit introducing to a new
parameter skip_xid to reset.
AFTER:
The RESET parameter for ALTER SUBSCRIPTION is required by the
follow-up commit that introduces a new resettable subscription
parameter "skip_xid".
Fixed.
doc/src/sgml/ref/alter_subscription.sgml
(2)
I don't think "RESET" is sufficiently described in
alter_subscription.sgml. Just putting it under "SET" and changing
"altered" to "set" doesn't explain what resetting does. It should say
something about setting the parameter back to its original (default)
value.
Doesn't "RESET" normally mean to change the parameter back to its default value?
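For example, a hypothetical usage sketch of the proposed SET/RESET behavior, assuming the subscription and parameter names used elsewhere in this thread:

```sql
-- Hypothetical sketch of the proposed RESET semantics:
ALTER SUBSCRIPTION test_sub SET (streaming = on);
ALTER SUBSCRIPTION test_sub RESET (streaming);  -- back to the default (off)
```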
(3)
case ALTER_SUBSCRIPTION_RESET_OPTIONS
Some comments here would be helpful e.g. Reset the specified
parameters back to their default values.
Okay, added.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Sep 2, 2021 at 9:03 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
Thank you for the comments!
Some initial comments for the v12-0003 patch:
(1) Patch comment
"This commit introduces another way to skip the transaction in question."
I think it should further explain: "This commit introduces another way
to skip the transaction in question, other than manually updating the
subscriber's database or using pg_replication_origin_advance()."
Updated.
doc/src/sgml/logical-replication.sgml
(2)
Suggested minor update:
BEFORE:
+ transaction that conflicts with the existing data. When a conflict produce
+ an error, it is shown in <structname>pg_stat_subscription_errors</structname>
+ view as follows:
AFTER:
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is recorded in the <structname>pg_stat_subscription_errors</structname>
+ view as follows:
Fixed.
(3)
+ found from those outputs (transaction ID 740 in the above case).
The transaction
Shouldn't it be transaction ID 716?
Right, fixed.
(4)
+ can be skipped by setting <replaceable>skip_xid</replaceable> to
the subscription
Is it better to say here ... "on the subscription"?
Okay, fixed.
(5)
Just skipping a transaction could make a subscriber inconsistent, right?
Would it be better as follows?
BEFORE:
+ In either way, those should be used as a last resort. They skip the whole
+ transaction including changes that may not violate any constraint and easily
+ make subscriber inconsistent if a user specifies the wrong transaction ID or
+ the position of origin.
AFTER:
+ Either way, those transaction skipping methods should be used as a last resort.
+ They skip the whole transaction, including changes that may not violate any
+ constraint. They may easily make the subscriber inconsistent, especially if a
+ user specifies the wrong transaction ID or the position of origin.
Agreed, fixed.
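For context, the pre-existing origin-based skipping that this warning refers to looks roughly like the sketch below. The origin name 'pg_16395' ('pg_' followed by the subscription OID) and the LSN are illustrative; the LSN must be determined carefully for the actual failing transaction, which is exactly why it is easy to get wrong:

```sql
-- Hypothetical sketch: skip by advancing the replication origin.
SELECT external_id, remote_lsn FROM pg_replication_origin_status;
SELECT pg_replication_origin_advance('pg_16395', '0/1634FA0');
```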
(6)
The grammar is not great in the following description, so here's a
suggested improvement:
BEFORE:
+ incoming change or by skipping the whole transaction. This option
+ specifies transaction ID that logical replication worker skips to
+ apply. The logical replication worker skips all data modification
AFTER:
+ incoming changes or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose application is to be skipped
+ by the logical replication worker. The logical replication worker
+ skips all data modification
Fixed.
src/backend/postmaster/pgstat.c
(7)
BEFORE:
+ * Tell the collector about clear the error of subscription.
AFTER:
+ * Tell the collector to clear the subscription error.
Fixed.
src/backend/replication/logical/worker.c
(8)
+ * subscription is invalidated and* MySubscription->skipxid gets
changed or reset.
There is a "*" after "and".
Fixed.
(9)
Do these lines really need to be moved up?
+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+
I had missed reverting this change; fixed.
src/backend/postmaster/pgstat.c
(10)
+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */
I think it should say: clear all fields except for last_failure,
last_errmsg and stat_reset_timestamp.
Fixed.
Those comments, including your comments on the v12-0001 and v12-0002
patches, are incorporated into my local branch. I'll submit the updated
patches after incorporating all other comments.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Sep 2, 2021 at 5:41 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
From Mon, Aug 30, 2021 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this patch. It's
borrowed from another thread[1] to fix the assertion failure for newly added
tests. Please review them.
Hi,
I reviewed the v12-0001 patch, here are some comments:
Thank you for the comments!
1)
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,7 +1441,6 @@ getinternalerrposition(void)
 	return edata->internalpos;
 }
-
It seems a miss change in elog.c
Fixed.
2)
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
The document doesn't mention the column "stats_reset".
Added.
3)
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid subrelid; /* InvalidOid if the apply worker, otherwise
+ * the table sync worker. hash table key. */
From the comments of subrelid, I think one subscription only have one apply
worker error entry, right? If so, I was thinking can we move the apply
error entry to PgStat_StatSubEntry. In that approach, we don't need to build a
inner hash table when there are no table sync error entry.
I wanted to avoid having unnecessary error entry fields when there is
no apply worker error but there is a table sync worker error. But
after more thoughts, the apply worker is more likely to raise an error
than table sync workers. So it might be better to have both
PgStat_StatSubErrEntry for the apply worker error and hash table for
table sync workers errors in PgStat_StatSubEntry.
4)
Is it possible to add some testcases to test the subscription error entry deletion?
Do you mean the tests checking if subscription error entry is deleted
after DROP SUBSCRIPTION?
Those comments are incorporated into my local branch. I'll submit the
updated patches after incorporating other comments.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Sep 2, 2021 at 8:37 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
From Mon, Aug 30, 2021 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
Hi,
I reviewed the 0002 patch and have a suggestion for it.
+ if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+ {
+ values[Anum_pg_subscription_subsynccommit - 1] =
+ CStringGetTextDatum("off");
+ replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+ }
Currently, the patch sets the default value outside of parse_subscription_options(),
but I think it might be more standard to set the value in
parse_subscription_options(). Like:
if (!is_reset)
{
...
+ }
+ else
+ opts->synchronous_commit = "off";
And then, we can set the value like:
values[Anum_pg_subscription_subsynccommit - 1] =
CStringGetTextDatum(opts.synchronous_commit);
You're right. Fixed.
Besides, instead of adding a switch case like the following:
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+ {
We can add a bool flag (isReset) in AlterSubscriptionStmt and check the flag
when invoking parse_subscription_options(). In this approach, the code can be
shorter.
Attached a diff file based on the v12-0002 which changes the code like the
above suggestion.
Thank you for the patch!
@@ -3672,11 +3671,12 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to
subscribe to */
List *options; /* List of DefElem nodes */
+ bool isReset; /* true if RESET option */
} AlterSubscriptionStmt;
It's unnatural to me that AlterSubscriptionStmt has an isReset flag in
spite of having 'kind' indicating the command. How about having the
RESET command use the same logic as SET, as you suggested, while keeping
both ALTER_SUBSCRIPTION_SET_OPTIONS and
ALTER_SUBSCRIPTION_RESET_OPTIONS?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Sep 3, 2021 at 3:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
BTW, these patches need rebasing (broken by recent commits, patches
0001, 0003 and 0004 no longer apply, and it's failing in the cfbot).
Thanks! I'll submit the updated patches early this week.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
From Sun, Sep 5, 2021 9:58 PM Masahiko Sawada <sawada.mshk@gmail.com>:
On Thu, Sep 2, 2021 at 8:37 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
From Mon, Aug 30, 2021 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
Hi,
I reviewed the 0002 patch and have a suggestion for it.
@@ -3672,11 +3671,12 @@ typedef enum AlterSubscriptionType
 typedef struct AlterSubscriptionStmt
 {
 	NodeTag		type;
-	AlterSubscriptionType kind;	/* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
+	AlterSubscriptionType kind;	/* ALTER_SUBSCRIPTION_OPTIONS, etc */
 	char	   *subname;	/* Name of the subscription */
 	char	   *conninfo;	/* Connection string to publisher */
 	List	   *publication;	/* One or more publication to subscribe to */
 	List	   *options;	/* List of DefElem nodes */
+	bool		isReset;	/* true if RESET option */
 } AlterSubscriptionStmt;
It's unnatural to me that AlterSubscriptionStmt has isReset flag even in spite of
having 'kind' indicating the command. How about having RESET comand use
the same logic of SET as you suggested while having both
ALTER_SUBSCRIPTION_SET_OPTIONS and
ALTER_SUBSCRIPTION_RESET_OPTIONS?
Yes, I agree with you that it will look more natural with ALTER_SUBSCRIPTION_RESET_OPTIONS.
Best regards,
Hou zj
On Sat, Sep 4, 2021 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Sep 3, 2021 at 2:15 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches.
For the v12-0003 patch:
I believe this feature is needed, but it also seems like a very powerful foot-gun. Can we do anything to make it less likely that users will hurt themselves with this tool?
This won't do any more harm than what users can currently do via
pg_replication_slot_advance, and the same is documented as well, see
[1]. This will be allowed only to superusers. Its effect will be
documented with a precautionary note to use it only when the other
available ways can't be used.
Right.
I am thinking back to support calls I have attended. When a production system is down, there is often some hesitancy to perform ad-hoc operations on the database, but once the decision has been made to do so, people try to get the whole process done as quickly as possible. If multiple transactions on the publisher fail on the subscriber, they will do so in series, not in parallel.
The subscriber will know only one transaction failure at a time, till
that is resolved, the apply won't move ahead and it won't know even if
there are other transactions that are going to fail in the future.
If the user could instead clear all failed transactions of the same type, that might make it less likely that they unthinkingly also skip subsequent errors of some different type. Perhaps something like ALTER SUBSCRIPTION ... SET (skip_failures = 'duplicate key value violates unique constraint "test_pkey"')?
I think if we want we can allow to skip particular error via
skip_error_code instead of via error message but not sure if it would
be better to skip a particular operation of the transaction rather
than the entire transaction. Normally from the atomicity purpose the
transaction can be either committed or rolled-back but not partially
done so I think it would be preferable to skip the entire transaction
rather than skipping it partially.
I think the suggestion by Mark is to skip the entire transaction if
the kind of error matches the specified error.
I think my proposed feature is meant to be a tool to cover situations
where something that should not happen has happened, rather than
conflict resolution. If users fall into a difficult situation where
they need to skip many transactions with this skip_xid feature, they
should rebuild the logical replication from scratch. It seems to me
that skipping all transactions that failed due to the same type of
failure is problematic, for example, if the user forgets to reset it.
If we want to skip the particular operation that failed due to the
specified error, we should have a proper conflict resolution feature
that can handle various types of conflicts with various types of
resolution methods, like other RDBMSs support.
This is arguably a different feature request, and not something your patch is required to address, but I wonder how much we should limit people shooting themselves in the foot? If we built something like this using your skip_xid feature, rather than instead of your skip_xid feature, would your feature need to be modified?
Sawada-San can answer better but I don't see any problem building any
such feature on top of what is currently proposed.
If the feature you proposed is to skip the entire transaction, I also
don't see any problem building the feature on top of my patch. The
patch adds the mechanism to skip the entire transaction so what we
need to do for that feature is to extend how to trigger the skipping
behavior.
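Putting the pieces of this thread together, the intended skip_xid workflow would look roughly like the hedged sketch below; the subscription name, XID, and view columns follow the examples posted earlier and are illustrative, not final:

```sql
-- 1. Identify the failing transaction from the proposed statistics view.
SELECT subname, xid, last_failure_message
  FROM pg_stat_subscription_errors;

-- 2. Tell the apply worker to skip exactly that transaction
--    (superuser only, per the patch).
ALTER SUBSCRIPTION test_sub SET (skip_xid = 716);

-- 3. pg_subscription.subskipxid is cleared automatically once the
--    transaction has been skipped; to cancel before that happens:
ALTER SUBSCRIPTION test_sub RESET (skip_xid);
```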
I'm having trouble thinking of an example conflict where skipping a transaction would be better than writing a BEFORE INSERT trigger on the conflicting table which suppresses or redirects conflicting rows somewhere else. Particularly for larger transactions containing multiple statements, suppressing the conflicting rows using a trigger would be less messy than skipping the transaction. I think your patch adds a useful tool to the toolkit, but maybe we should mention more alternatives in the docs? Something like, "changing the data on the subscriber so that it doesn't conflict with incoming changes, or dropping the conflicting constraint or unique index, or writing a trigger on the subscriber to suppress or redirect conflicting incoming changes, or as a last resort, by skipping the whole transaction"?
+1 for extending the docs as per this suggestion.
Agreed. I'll add such description to the doc.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Sun, Sep 5, 2021 at 10:58 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Sep 3, 2021 at 3:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches. 0004 patch is not the scope of this
patch. It's borrowed from another thread[1] to fix the assertion
failure for newly added tests. Please review them.
BTW, these patches need rebasing (broken by recent commits, patches
0001, 0003 and 0004 no longer apply, and it's failing in the cfbot).
Thanks! I'll submit the updated patches early this week.
Sorry for the late response. I've attached the updated patches that
incorporate all comments unless I missed something. Please review
them.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v13-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch (application/octet-stream)
From 7a410e4dd45afaafda7bbaf8153dc965bfd4518a Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v13 3/3] Add skip_xid option to ALTER SUBSCRIPTION.
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating pg_subscription.subskipxid field, telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction the apply worker clears
pg_subscription.subskipxid. Also, it clears the error statistics of
the subscription in pg_stat_subscription_errors system view as well, so
that the user does not get confused. It's done by sending the message
for clearing a subscription error to the stats collector.
---
doc/src/sgml/logical-replication.sgml | 56 ++++-
doc/src/sgml/ref/alter_subscription.sgml | 37 +++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 42 +++-
src/backend/postmaster/pgstat.c | 47 +++-
src/backend/replication/logical/worker.c | 195 +++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/include/pgstat.h | 7 +-
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 +
src/test/subscription/t/024_skip_xact.pl | 244 +++++++++++++++++++++
12 files changed, 646 insertions(+), 21 deletions(-)
create mode 100644 src/test/subscription/t/024_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..f7da60290a 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,68 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the whole
+ transaction. This option specifies the ID of the transaction whose
+ application is to be skipped by the logical replication worker. The logical
+ replication worker skips all data modification transaction conflicts with
+ the existing data. When a conflict produce an error, it is shown in
+ <structname>pg_stat_subscription_errors</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> on the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ In this case, you need to consider changing the data on the subscriber so that it
+ doesn't conflict with incoming changes, or dropping the conflicting constraint or
+ unique index, or writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or as a last resort, by skipping the whole transaction.
+ They skip the whole transaction, including changes that may not violate any
+ constraint. They may easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 376fc154b1..1f6c05c5d5 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -202,8 +202,41 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are: <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ <literal>streaming</literal>, and following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraints the logical replication
+ will stop until it is resolved. The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ incoming change or by skipping the whole transaction. This option
+ specifies transaction ID that logical replication worker skips to
+ apply. The logical replication worker skips all data modification
+ changes within the specified transaction. Therefore, since it skips
+ the whole transaction including the changes that may not violate the
+ constraint, it should only be used as a last resort. This option has
+ no effect for the transaction that is already prepared with enabling
+ <literal>two_phase</literal> on susbscriber. After the logical
+ replication successfully skips the transaction, the transaction ID
+ (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ Setting and resetting of <literal>skip_xid</literal> option is
+ restrited to superusers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 896ec8b836..fd74037fb8 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -888,7 +916,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (is_reset)
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
else
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
@@ -941,6 +969,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 0178186838..da4c493131 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1744,11 +1744,32 @@ pgstat_reset_subscription_error(Oid subid, Oid subrelid)
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
msg.m_subid = subid;
msg.m_subrelid = subrelid;
+ msg.m_clear = false;
msg.m_reset = true;
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
}
+/* ----------
+ * pgstat_clear_subscription_error() -
+ *
+ * Tell the collector to clear the error of subscription.
+ * ----------
+ */
+void
+pgstat_clear_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_clear = true;
+ msg.m_reset = false;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -2038,6 +2059,7 @@ pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
msg.m_subid = subid;
msg.m_subrelid = subrelid;
msg.m_reset = false;
+ msg.m_clear = false;
msg.m_relid = relid;
msg.m_command = command;
msg.m_xid = xid;
@@ -6158,24 +6180,39 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
PgStat_StatSubErrEntry *errent;
- bool create = !msg->m_reset;
+ bool create = !(msg->m_reset || msg->m_clear);
/* Get subscription error */
errent = pgstat_get_subscription_error_entry(msg->m_subid,
msg->m_subrelid,
create);
- if (msg->m_reset)
+ if (msg->m_reset || msg->m_clear)
{
Assert(!create);
+ Assert(!(msg->m_reset && msg->m_clear));
if (errent == NULL)
return;
- /* reset fields and set reset timestamp */
- pgstat_reset_subscription_error_entry(errent,
- GetCurrentTimestamp());
+ /* Both clear and reset initialize these fields */
+ errent->relid = InvalidOid;
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+
+ /*
+ * If the reset is requested, reset more fields and set the reset
+ * timestamp.
+ */
+ if (msg->m_reset)
+ {
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = GetCurrentTimestamp();
+ }
}
else
{
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index e91fa86b1a..b20de5909b 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -255,6 +256,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID if we're skipping all data-modification changes
+ * (INSERT/UPDATE/DELETE/TRUNCATE) of the transaction specified by
+ * MySubscription->skipxid. Note that we don't skip receiving the changes,
+ * particularly in streaming cases, since we decide whether to skip applying
+ * the changes only when we start to apply them. Once we start skipping, we
+ * copy the XID to skipping_xid and keep skipping until the whole transaction
+ * is skipped, even if the subscription is invalidated and
+ * MySubscription->skipxid gets changed or reset. When we stop skipping, we
+ * reset the skip XID (subskipxid) in the pg_subscription catalog and
+ * associate the origin status with the transaction that resets the skip XID,
+ * so that we can start streaming from the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -335,6 +351,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -789,6 +808,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -813,7 +837,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop skipping the transaction if enabled. Otherwise, commit the
+ * changes that have just been applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -841,6 +876,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -899,9 +937,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -915,6 +954,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1046,6 +1089,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1056,6 +1102,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1081,9 +1131,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1206,6 +1257,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1289,6 +1341,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping the transaction if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1428,9 +1484,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2316,6 +2386,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recursively when applying spooled changes, save the current
@@ -3662,3 +3743,103 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset both skipping_xid and the skip XID in the
+ * catalog (pg_subscription.subskipxid). If origin_lsn and origin_committs are
+ * valid, we set the origin state to the transaction commit that resets the
+ * skip XID so that we can start streaming from the transaction next to the
+ * one that we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know that
+ * the subscription is no longer stuck on the conflict.
+ *
+ * The message for clearing the error statistics can be lost, but that's
+ * okay. Users can confirm that logical replication is working in other
+ * ways, for example by checking the pg_stat_subscription view. Also,
+ * users can reset a single subscription's error statistics with the
+ * pg_reset_subscription_error SQL function.
+ */
+ pgstat_clear_subscription_error(MySubscription->oid, InvalidOid);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 5424380bb7..c5afcad231 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3694,6 +3694,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 6ff8720631..5ed1319743 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -536,7 +536,7 @@ typedef struct PgStat_MsgReplSlot
/* ----------
* PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
- * update/reset the error happening during logical
+ * update/reset/clear the error happening during logical
* replication.
* ----------
*/
@@ -554,7 +554,9 @@ typedef struct PgStat_MsgSubscriptionErr
Oid m_subid;
Oid m_subrelid;
- /* The reset message uses below field */
+ /* The clear and reset messages use the fields below */
+ bool m_clear; /* Clear all fields except for last_failure,
+ * last_errmsg and stat_reset_timestamp */
bool m_reset; /* Reset all fields and set reset_stats
* timestamp */
@@ -1111,6 +1113,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_clear_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index e4c16cab66..e4dc4fb946 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -293,6 +293,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 3b0fbea897..c458b38985 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -228,6 +228,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/024_skip_xact.pl b/src/test/subscription/t/024_skip_xact.pl
new file mode 100644
index 0000000000..affb663803
--- /dev/null
+++ b/src/test/subscription/t/024_skip_xact.pl
@@ -0,0 +1,244 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# Test if the error reported in the pg_stat_subscription_errors view is expected.
+sub test_subscription_error
+{
+ my ($node, $source, $relname, $expected_error, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Check the error reported in the pg_stat_subscription_errors view and skip
+# the failed transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $source, $subname, $relname, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $source, $relname, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber so logical replication restarts without waiting
+ # for wal_retrieve_retry_interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+ # Also wait for the error details to be cleared.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT command IS NULL FROM pg_stat_subscription_errors
+WHERE subname = '$subname' AND failure_source = '$source';
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# Raise wal_retrieve_retry_interval so that repeated apply errors don't
+# overflow the server log.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violating
+# the unique constraint on test_tab1. Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber,
+ 'tablesync', 'test_tab2',
+ qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'check the error reported by the table sync worker');
+
+# Insert enough rows into test_tab_streaming to exceed the 64kB limit, also
+# raising an error on the subscriber while applying the spooled changes for
+# the same reason. Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error reported by the apply worker while applying streaming changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping the transactions that are prepared and stream_prepared. We insert
+# the same data as the previous tests but prepare the transactions. Those insertions
+# raise an error on the subscriptions. Then we skip the transactions in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error on changes of the prepared transaction');
+
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
Attachment: v13-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch (application/octet-stream)
From cce498d4a2ab4aff4850d7e60aac1096381aba28 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v13 2/3] Add RESET command to ALTER SUBSCRIPTION command.
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be set are streaming, binary,
synchronous_commit.
The RESET parameter for ALTER SUBSCRIPTION is required by the
follow-up commit that introduces a new resettable subscription
parameter "skip_xid".
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 59 +++++++++++++++-------
src/backend/parser/gram.y | 11 +++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 ++++-
src/test/regress/sql/subscription.sql | 13 +++++
6 files changed, 87 insertions(+), 23 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 835be0d2a4..376fc154b1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -189,16 +190,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are <literal>streaming</literal>,
+ <literal>binary</literal>, and <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..896ec8b836 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,21 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
- supported_opts = (SUBOPT_SLOT_NAME |
- SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ bool is_reset = (stmt->kind == ALTER_SUBSCRIPTION_RESET_OPTIONS);
+
+ if (is_reset)
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+ else
+ supported_opts = (SUBOPT_SLOT_NAME |
+ SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, is_reset);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -926,7 +948,8 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +984,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1008,7 +1031,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = SUBOPT_REFRESH | SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1056,7 +1079,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e3068a374e..70558f964a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9721,7 +9721,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 45e4f2a16e..5424380bb7 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3676,7 +3676,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3688,7 +3689,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 15a1ac6398..e4c16cab66 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -281,11 +281,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7faa935a2a..3b0fbea897 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -215,6 +215,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
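For reviewers trying out the RESET patch above, here is a minimal sketch of the intended behavior, following the regression tests (it assumes a subscription named regress_testsub already exists, as in subscription.sql):

```sql
-- Reset options back to their defaults; multiple options may be reset at once.
ALTER SUBSCRIPTION regress_testsub RESET (streaming);
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary);

-- These fail: "connect" is not a resettable parameter, and RESET
-- must not include values.
ALTER SUBSCRIPTION regress_testsub RESET (connect);
ALTER SUBSCRIPTION regress_testsub RESET (binary = true);
```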
Attachment: v13-0001-Add-pg_stat_subscription_errors-statistics-view.patch (application/octet-stream)
From 7ca7e66746d76b9659ef64ccba8d074c0f8e4e43 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v13 1/3] Add pg_stat_subscription_errors statistics view.
This commit adds a new system view, pg_stat_subscription_errors, that
shows information about any errors which occur during the application
of logical replication changes as well as during the initial
table synchronization.
Subscription error entries are removed by autovacuum workers: after
table synchronization completes in the table sync worker case, and
after the subscription is dropped in the apply worker case.
It also adds the SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
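To illustrate the intended usage, a sketch (the subscription OID 16394 here is hypothetical, purely for illustration):

```sql
-- Inspect the last error recorded for each subscription worker.
SELECT subname, relid, command, xid, failure_count, last_failure_message
FROM pg_stat_subscription_errors;

-- Reset the apply worker's error entry for the subscription with OID 16394;
-- pass a non-NULL relid instead to reset a tablesync worker's entry.
SELECT pg_stat_reset_subscription_error(16394, NULL);
```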
---
doc/src/sgml/monitoring.sgml | 169 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/postmaster/pgstat.c | 685 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 51 +-
src/backend/utils/adt/pgstatfuncs.c | 112 ++++
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 106 ++++
src/test/regress/expected/rules.out | 22 +
src/tools/pgindent/typedefs.list | 5 +
10 files changed, 1189 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 2281ba120f..3a4c98ba42 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that occurred on a subscription, showing information about
+ each subscription error.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,144 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription was created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error occurred. This
+ field is always NULL if the error is reported by a
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher's transaction that was being
+ applied when the error occurred. This field is always NULL if
+ the error is reported by a <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of worker reporting the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times the error occurred in the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Last reported error message.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which these statistics were last reset.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5319,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..b0cd8d2546 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_subscription s ON (e.subid = s.oid)
+ JOIN pg_database d ON (s.subdbid = d.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 3450a10129..0178186838 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_HASH_SIZE 32
/* ----------
@@ -280,6 +283,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -330,6 +334,14 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubEntry *pgstat_get_subscription_entry(Oid subid,
+ bool create);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+static void pgstat_reset_subscription_error_entry(PgStat_StatSubErrEntry *errent,
+ TimestampTz ts);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -369,6 +381,10 @@ static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len
static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1147,6 +1163,164 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubEntry *subent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS hstat_rel;
+ List *not_ready_rels_list;
+ HTAB *not_ready_rels_htab;
+ ListCell *lc;
+ HASHCTL hash_ctl;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Check if the subscription is still live */
+ if (hash_search(htab, (void *) &(subent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = subent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no table
+ * sync error entries.
+ */
+ if (subent->sync_errors == NULL)
+ continue;
+
+ /*
+ * The subscription has table sync error entries. We search for error
+ * entries of table sync workers whose synchronization has already
+ * completed; those errors should be removed.
+ *
+ * Note that the lifetime of error entries of the apply worker and
+ * the table sync worker are different. The former lives until
+ * the subscription is dropped whereas the latter lives until the table
+ * synchronization is completed.
+ */
+ not_ready_rels_list = GetSubscriptionNotReadyRelations(subent->subid);
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ /*
+ * The number of not-ready relations can be high for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for faster lookups.
+ */
+ foreach(lc, not_ready_rels_list)
+ {
+ SubscriptionRelState *r_elem = (SubscriptionRelState *) lfirst(lc);
+ SubscriptionRelState *r_entry;
+
+ CHECK_FOR_INTERRUPTS();
+ r_entry = hash_search(not_ready_rels_htab, (void *) &(r_elem->relid),
+ HASH_ENTER, NULL);
+ memcpy(r_entry, r_elem, sizeof(SubscriptionRelState));
+ }
+
+ list_free(not_ready_rels_list);
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+
+ /*
+ * Search for all table sync error entries whose relation is
+ * already in ready state.
+ */
+ hash_seq_init(&hstat_rel, subent->sync_errors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ Assert(OidIsValid(errent->relid));
+
+ CHECK_FOR_INTERRUPTS();
+
+ /*
+ * Add the relid to the message if the table synchronization
+ * for this relation has already completed or the table is no
+ * longer subscribed.
+ */
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->relid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->relid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ hash_destroy(not_ready_rels_htab);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1556,6 +1730,25 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the subscription error.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErr msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = true;
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1824,6 +2017,36 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about the subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_reset = false;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_failure_time = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2892,6 +3115,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ---------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3469,6 +3708,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_connstat(&msg.msg_conn, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3769,6 +4021,57 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionHash)
+ {
+ PgStat_StatSubEntry *subent;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubErrEntry *errent;
+ int32 nerrors = (subent->sync_errors == NULL)
+ ? 0
+ : hash_get_num_entries(subent->sync_errors);
+
+ /*
+ * We always write at least subscription entry since it could have
+ * apply worker error.
+ */
+ fputc('S', fpout);
+ rc = fwrite(subent, sizeof(PgStat_StatSubEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ if (nerrors > 0)
+ {
+ HASH_SEQ_STATUS relhstat;
+
+ hash_seq_init(&relhstat, subent->sync_errors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubErrEntry entry,
+ * which contains a fixed-length error message string of
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes, bloating the stats
+ * file. That is acceptable as long as the number of error
+ * entries stays low; if that assumption no longer holds,
+ * we should write the string and its length instead.
+ */
+ rc = fwrite(errent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4230,6 +4533,108 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs, describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubEntry *subent;
+ int32 nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin) !=
+ sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the subscription error entry and initialize
+ * fields
+ */
+ subent =
+ (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &(subbuf.subid),
+ HASH_ENTER, NULL);
+ memcpy(&(subent->apply_error), &(subbuf.apply_error),
+ sizeof(PgStat_StatSubErrEntry));
+ subent->sync_errors = NULL;
+
+ /*
+ * Read the number of table sync errors in the
+ * subscription
+ */
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Read table sync error entries */
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubErrEntry errbuf;
+ PgStat_StatSubErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (subent->sync_errors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subent->sync_errors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the table sync error information to the
+ * subscription hash
+ */
+ errent =
+ (PgStat_StatSubErrEntry *) hash_search(subent->sync_errors,
+ (void *) &(errbuf.relid),
+ HASH_ENTER, NULL);
+
+ memcpy(errent, &errbuf, sizeof(PgStat_StatSubErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4572,6 +4977,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs describing
+ * subscription errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4777,6 +5226,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5699,6 +6149,125 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+ bool create = !msg->m_reset;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ create);
+
+ if (msg->m_reset)
+ {
+ Assert(!create);
+
+ if (errent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ pgstat_reset_subscription_error_entry(errent,
+ GetCurrentTimestamp());
+ }
+ else
+ {
+ Assert(errent);
+ Assert((OidIsValid(msg->m_subrelid) && msg->m_subrelid == msg->m_relid &&
+ msg->m_subrelid == errent->relid) || !OidIsValid(msg->m_subrelid));
+
+ /*
+ * If reported by the apply worker, we have to update the relid since
+ * the apply worker could report different relid per error. In table
+ * sync error case, relid should have been set by a hash table lookup.
+ * So we don't update the hash entry key.
+ */
+ if (!OidIsValid(msg->m_subrelid))
+ errent->relid = msg->m_relid;
+
+ /* update the error entry */
+ errent->command = msg->m_command;
+ errent->xid = msg->m_xid;
+ errent->failure_count++;
+ errent->last_failure = msg->m_failure_time;
+ strlcpy(errent->last_errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (subent == NULL)
+ continue;
+
+ /* Cleanup the table sync errors */
+ if (subent->sync_errors != NULL)
+ hash_destroy(subent->sync_errors);
+
+ /* Remove the subscription entry */
+ (void) hash_search(subscriptionHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subid, false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription with msg->m_subid is removed and the
+ * corresponding entry is also removed before receiving the error purge
+ * message.
+ */
+ if (subent == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ Assert(OidIsValid(msg->m_relids[i]));
+ (void) hash_search(subent->sync_errors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5817,6 +6386,122 @@ pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotent, TimestampTz ts)
slotent->stat_reset_timestamp = ts;
}
+/* ----------
+ * pgstat_get_subscription_entry
+ *
+ * Return the subscription statistics with the subscription OID. Return NULL
+ * if not found and the caller didn't request to create it.
+ *
+ * 'create' tells whether to create the new subscription entry if it is not
+ * found.
+ * ----------
+ */
+static PgStat_StatSubEntry *
+pgstat_get_subscription_entry(Oid subid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ subent = (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ pgstat_reset_subscription_error_entry(&(subent->apply_error), 0);
+ subent->sync_errors = NULL;
+ }
+
+ return subent;
+}
+
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the entry of subscription error entry with the subscription
+ * OID and relation OID. Return NULL if not found and the caller didn't
+ * request to create it.
+ *
+ * 'create' tells whether to create the new subscription relation entry if it is
+ * not found.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ PgStat_StatSubErrEntry *errent;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+ bool found;
+
+ subent = pgstat_get_subscription_entry(subid, create);
+
+ if (subent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ /* Return the apply error worker if requested */
+ if (!OidIsValid(subrelid))
+ return &(subent->apply_error);
+
+ if (subent->sync_errors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subent->sync_errors = hash_create("Subscription error hash",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ errent = (PgStat_StatSubErrEntry *) hash_search(subent->sync_errors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ pgstat_reset_subscription_error_entry(errent, 0);
+
+ return errent;
+}
+
+/* Reset fields other than relid */
+static void
+pgstat_reset_subscription_error_entry(PgStat_StatSubErrEntry *errent,
+ TimestampTz ts)
+{
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = ts;
+}
+
/*
* pgstat_slru_index
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 8d96c926b4..e91fa86b1a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3329,6 +3329,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3429,8 +3430,27 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3548,7 +3568,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..51f693c22b 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset a subscription error stats */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,97 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubErrEntry *errent;
+ int i;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ errent = pgstat_fetch_subscription_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (errent == NULL)
+ PG_RETURN_NULL();
+
+ i = 0;
+ values[i++] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(errent->relid))
+ values[i++] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[i++] = true;
+
+ if (errent->command == 0)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(errent->command));
+
+ if (TransactionIdIsValid(errent->xid))
+ values[i++] = TransactionIdGetDatum(errent->xid);
+ else
+ nulls[i++] = true;
+
+ if (OidIsValid(relid))
+ values[i++] = CStringGetTextDatum("tablesync");
+ else
+ values[i++] = CStringGetTextDatum("apply");
+
+ values[i++] = Int64GetDatum(errent->failure_count);
+
+ if (errent->last_failure == 0)
+ nulls[i++] = true;
+ else
+ values[i++] = TimestampTzGetDatum(errent->last_failure);
+
+ values[i++] = CStringGetTextDatum(errent->last_errmsg);
+
+ if (errent->stat_reset_timestamp == 0)
+ nulls[i++] = true;
+ else
+ values[i++] = TimestampTzGetDatum(errent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..ac02061347 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 509849c7ff..6ff8720631 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -530,6 +534,67 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
+ * update/reset the error happening during logical
+ * replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* The reset message uses below field */
+ bool m_reset; /* Reset all fields and set reset_stats
+ * timestamp */
+
+ /* The error report message uses below fields */
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_failure_time;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the autovacuum to purge the subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by the autovacuum to purge the subscription
+ * errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -701,6 +766,9 @@ typedef union PgStat_Msg
PgStat_MsgChecksumFailure msg_checksumfailure;
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConn msg_conn;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -916,6 +984,39 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription error statistics kept in the stats collector, representing
+ * an error that occurred during application of logical replicatoin or
+ * initial table synchronization.
+ */
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid relid; /* hash table key */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char last_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubErrEntry;
+
+/*
+ * Subscription statistics kept in the stats collector.
+ */
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+
+ /*
+ * Statistics of errors that occurred during logical replication. While
+ * having the hash table for table sync errors we have a separate
+ * statistics value for apply error (apply_error), because we can avoid
+ * building a nested hash table for table sync errors in the case where
+ * there is no table sync error, where is the common case in practice.
+ */
+ PgStat_StatSubErrEntry apply_error;
+ HTAB *sync_errors;
+} PgStat_StatSubEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1009,6 +1110,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_autovac(Oid dboid);
extern void pgstat_report_vacuum(Oid tableoid, bool shared,
@@ -1024,6 +1126,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1122,6 +1227,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid, Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..3719b8a41a 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_subscription s ON ((e.subid = s.oid)))
+ JOIN pg_database d ON ((s.subdbid = d.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 423780652f..141b4a3276 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1939,6 +1939,9 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionErr
+PgStat_MsgSubscriptionErrPurge
+PgStat_MsgSubscriptionPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1950,6 +1953,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubEntry
+PgStat_StatSubErrEntry
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Fri, Sep 10, 2021 at 12:33 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Sorry for the late response. I've attached the updated patches that
incorporate all comments unless I missed something. Please review
them.
Here's some review comments for the v13-0001 patch:
doc/src/sgml/monitoring.sgml
(1)
There's an extra space in the following line, before "processing":
+ OID of the relation that the worker was processing when the
(2) Suggested wording update:
BEFORE:
+ field is always NULL if the error is reported by
AFTER:
+ field is always NULL if the error is reported by the
(3) Suggested wording update:
BEFORE:
+ by <literal>tablesync</literal> worker.
AFTER:
+ by the <literal>tablesync</literal> worker.
(4)
Missing "." at end of following description (inconsistent with other doc):
+ Time at which these statistics were last reset
(5) Suggested wording update:
BEFORE:
+ can be granted EXECUTE to run the function.
AFTER:
+ can be granted EXECUTE privilege to run the function.
src/backend/postmaster/pgstat.c
(6) Suggested wording update:
BEFORE:
+ * for this relation already completes or the table is no
AFTER:
+ * for this relation already completed or the table is no
(7)
In the code below, since errmsg.m_nentries only ever gets incremented
by the 1st IF condition, it's probably best to include the 2nd IF
block within the 1st IF condition. Then can avoid checking
"errmsg.m_nentries" each loop iteration.
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->relid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->relid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
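For illustration, the append-then-flush pattern being discussed can be modeled in a few lines of standalone C; flush_count, BATCH_MAX, and the counters below are hypothetical stand-ins for the patch's m_nentries/pgstat_send() machinery, not the real code:

```c
#include <stddef.h>

#define BATCH_MAX 4             /* stand-in for PGSTAT_NUM_SUBSCRIPTIONERRPURGE */

/* Simplified model of the purge loop: entries are appended to a
 * fixed-size batch and the batch is flushed when full.  Keeping the
 * "is the batch full?" check next to the append (as suggested above)
 * means it only runs when an entry was actually added. */
static int
flush_count(int nentries_to_add)
{
    int batch = 0;              /* stand-in for errmsg.m_nentries */
    int flushes = 0;

    for (int i = 0; i < nentries_to_add; i++)
    {
        batch++;                /* stand-in for m_relids[m_nentries++] */
        if (batch >= BATCH_MAX) /* checked right after an append */
        {
            flushes++;          /* stand-in for pgstat_send() */
            batch = 0;
        }
    }
    /* the real code would send any final partial batch after the loop */
    return flushes;
}
```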
(8)
+ * Tell the collector about reset the subscription error.
Is this meant to say "Tell the collector to reset the subscription error." ?
(9)
I think the following:
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
should be:
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg) + 1;
to account for the \0 terminator.
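The off-by-one is easy to check in isolation. The sketch below uses a hypothetical DemoMsg (the real PgStat_MsgSubscriptionErr has more fields): the bytes to send are the fixed prefix, the string body, and one terminator byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PgStat_MsgSubscriptionErr: some fixed
 * fields followed by the error-message buffer. */
typedef struct DemoMsg
{
    int  m_subid;
    char m_errmsg[256];
} DemoMsg;

/* Number of bytes that must be sent so the receiver sees the fixed
 * fields plus the NUL-terminated string that strlcpy() wrote. */
static size_t
demo_msg_len(const char *errmsg)
{
    return offsetof(DemoMsg, m_errmsg) + strlen(errmsg) + 1;
}
```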
(10)
I don't think that using the following Assert is really correct here,
because PgStat_MsgSubscriptionErr is not setup to have the maximum
number of m_errmsg[] entries to fill up to PGSTAT_MAX_MSG_SIZE (as are
some of the other pgstat structs):
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
(the max size of all of the pgstat structs is statically asserted anyway)
It would be correct to do the following instead:
+ Assert(strlen(errmsg) < PGSTAT_SUBSCRIPTIONERR_MSGLEN);
The overflow is guarded by the strlcpy() in any case.
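As background on that guard: strlcpy() always NUL-terminates and never writes past the destination size, silently truncating over-long input. Since strlcpy is a BSD function, the sketch below uses a local demo_strlcpy and a hypothetical DEMO_MSGLEN in place of PGSTAT_SUBSCRIPTIONERR_MSGLEN:

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MSGLEN 8   /* stand-in for PGSTAT_SUBSCRIPTIONERR_MSGLEN */

/* Minimal strlcpy-style copy: always NUL-terminates dst and never
 * writes more than "size" bytes, which is why the strlcpy() in the
 * patch guards against overflow regardless of the Assert. */
static size_t
demo_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size > 0)
    {
        size_t copylen = (srclen < size - 1) ? srclen : size - 1;

        memcpy(dst, src, copylen);
        dst[copylen] = '\0';
    }
    return srclen;      /* intended length, so callers can detect truncation */
}

/* Truncate "src" into a DEMO_MSGLEN buffer and report the stored length. */
static size_t
demo_truncated_len(const char *src)
{
    char buf[DEMO_MSGLEN];

    demo_strlcpy(buf, src, sizeof(buf));
    return strlen(buf);
}
```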
(11)
Would be better to write:
+ rc = fwrite(&nerrors, sizeof(nerrors), 1, fpout);
instead of:
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
(12)
Would be better to write:
+ if (fread(&nerrors, 1, sizeof(nerrors), fpin) != sizeof(nerrors))
instead of:
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
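The point of sizeof(nerrors) over sizeof(int32) is that the I/O size then tracks the variable's declared type. A self-contained round-trip of the pattern (tmpfile-based; names are hypothetical):

```c
#include <stdint.h>
#include <stdio.h>

/* Write a counter and read it back, sizing both calls from the
 * variables themselves (sizeof(nerrors_in) / sizeof(nerrors_out))
 * rather than from a spelled-out type, so a later type change cannot
 * desynchronize the I/O sizes.  Returns 1 on a successful match. */
static int
roundtrip_count(int32_t nerrors_in)
{
    FILE   *f = tmpfile();
    int32_t nerrors_out = 0;

    if (f == NULL)
        return -1;
    if (fwrite(&nerrors_in, sizeof(nerrors_in), 1, f) != 1)
        return -1;
    rewind(f);
    if (fread(&nerrors_out, sizeof(nerrors_out), 1, f) != 1)
        return -1;
    fclose(f);
    return nerrors_out == nerrors_in;
}
```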
src/include/pgstat.h
(13)
BEFORE:
+ * update/reset the error happening during logical
AFTER:
+ * update/reset the error occurring during logical
(14)
Typo: replicatoin -> replication
+ * an error that occurred during application of logical replicatoin or
(15) Suggested wording update:
BEFORE:
+ * there is no table sync error, where is the common case in practice.
AFTER:
+ * there is no table sync error, which is the common case in practice.
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Sep 10, 2021 at 12:33 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Sorry for the late response. I've attached the updated patches that
incorporate all comments unless I missed something. Please review
them.
A few review comments for the v13-0002 patch:
(1)
I suggest a small update to the patch comment:
BEFORE:
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be set are streaming, binary,
synchronous_commit.
AFTER:
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters to their default value. The parameters that can be reset
are streaming, binary, and synchronous_commit.
(2)
In the documentation, the RESETable parameters should be listed in the
same way and order as for SET:
BEFORE:
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ </para>
AFTER:
+ <para>
+ The parameters that can be reset are
<literal>synchronous_commit</literal>,
+ <literal>binary</literal>, and <literal>streaming</literal>.
+ </para>
Also I'm thinking it would be beneficial to say before this:
RESET is used to set parameters back to their default value.
(3)
I notice that if you try to reset the slot_name, you get the following message:
postgres=# alter subscription sub reset (slot_name);
ERROR: unrecognized subscription parameter: "slot_name"
This is a bit misleading, because slot_name IS a subscription
parameter, just not resettable.
It would be better if it said something like: ERROR: not a resettable
subscription parameter: "slot_name"
However, it seems that this is also an existing issue with SET (e.g.
for "refresh" or "two_phase")
postgres=# alter subscription sub set (refresh=true);
ERROR: unrecognized subscription parameter: "refresh"
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Sep 10, 2021 at 8:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Fri, Sep 10, 2021 at 12:33 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Sorry for the late response. I've attached the updated patches that
incorporate all comments unless I missed something. Please review
them.
Here's some review comments for the v13-0001 patch:
Thank you for the comments!
doc/src/sgml/monitoring.sgml
(1)
There's an extra space in the following line, before "processing":
+ OID of the relation that the worker was processing when the
Fixed.
(2) Suggested wording update:
BEFORE:
+ field is always NULL if the error is reported by
AFTER:
+ field is always NULL if the error is reported by the
Fixed.
(3) Suggested wording update:
BEFORE:
+ by <literal>tablesync</literal> worker.
AFTER:
+ by the <literal>tablesync</literal> worker.
Fixed.
(4)
Missing "." at end of following description (inconsistent with other doc):
+ Time at which these statistics were last reset
Fixed.
(5) Suggested wording update:
BEFORE:
+ can be granted EXECUTE to run the function.
AFTER:
+ can be granted EXECUTE privilege to run the function.
Since descriptions of other stats reset functions don't use "EXECUTE
privilege", I think it'd be better to leave it as is for consistency.
src/backend/postmaster/pgstat.c
(6) Suggested wording update:
BEFORE:
+ * for this relation already completes or the table is no
AFTER:
+ * for this relation already completed or the table is no
Fixed.
(7)
In the code below, since errmsg.m_nentries only ever gets incremented
by the 1st IF condition, it's probably best to include the 2nd IF
block within the 1st IF condition. Then can avoid checking
"errmsg.m_nentries" each loop iteration.
+ if (hash_search(not_ready_rels_htab, (void *) &(errent->relid),
+ HASH_FIND, NULL) == NULL)
+ errmsg.m_relids[errmsg.m_nentries++] = errent->relid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
Agreed. Instead of including the 2nd if block within the 1st if block,
I changed the 1st if condition to check the opposite condition and
continued the loop if it's true (i.e., the table is still under table
synchronization).
(8)
+ * Tell the collector about reset the subscription error.
Is this meant to say "Tell the collector to reset the subscription error." ?
Yes, fixed.
(9)
I think the following:
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
should be:
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg) + 1;
to account for the \0 terminator.
Fixed.
(10)
I don't think that using the following Assert is really correct here,
because PgStat_MsgSubscriptionErr is not setup to have the maximum
number of m_errmsg[] entries to fill up to PGSTAT_MAX_MSG_SIZE (as are
some of the other pgstat structs):
+ Assert(len < PGSTAT_MAX_MSG_SIZE);
(the max size of all of the pgstat structs is statically asserted anyway)
It would be correct to do the following instead:
+ Assert(strlen(errmsg) < PGSTAT_SUBSCRIPTIONERR_MSGLEN);
The overflow is guarded by the strlcpy() in any case.
Agreed. Fixed.
(11)
Would be better to write:
+ rc = fwrite(&nerrors, sizeof(nerrors), 1, fpout);
instead of:
+ rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
(12)
Would be better to write:
+ if (fread(&nerrors, 1, sizeof(nerrors), fpin) != sizeof(nerrors))
instead of:
+ if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
Agreed.
src/include/pgstat.h
(13)
BEFORE:
+ * update/reset the error happening during logical
AFTER:
+ * update/reset the error occurring during logical
Fixed.
(14)
Typo: replicatoin -> replication
+ * an error that occurred during application of logical replicatoin or
Fixed.
(15) Suggested wording update:
BEFORE:
+ * there is no table sync error, where is the common case in practice.
AFTER:
+ * there is no table sync error, which is the common case in practice.
Fixed.
I'll submit the updated patches.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
From Thur, Sep 9, 2021 10:33 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Sorry for the late response. I've attached the updated patches that incorporate
all comments unless I missed something. Please review them.
Thanks for the new version patches.
Here are some comments for the v13-0001 patch.
1)
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
It seems we can invoke pgstat_setheader once before the loop like the
following:
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
2)
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
Same as 1), we can invoke pgstat_setheader once before the loop like:
+ submsg.m_nentries = 0;
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
3)
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by the autovacuum to purge the subscription
+ * errors.
The comments said it's sent by autovacuum, would the manual vacuum also send
this message ?
4)
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
Does it look cleaner that we use the offset of m_relid here like the following ?
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_relid));
Best regards,
Hou zj
Sorry for the late reply. I was on vacation.
On Tue, Sep 14, 2021 at 11:27 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
From Thur, Sep 9, 2021 10:33 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Sorry for the late response. I've attached the updated patches that incorporate
all comments unless I missed something. Please review them.

Thanks for the new version patches.
Here are some comments for the v13-0001 patch.
Thank you for the comments!
1)
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
It seems we can invoke pgstat_setheader once before the loop like the
following:
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
2)
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
Same as 1), we can invoke pgstat_setheader once before the loop like:
+ submsg.m_nentries = 0;
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
But if we do that, we set the header even if there is no message to
send, right? Looking at other similar code in pgstat_vacuum_stat(), we
set the header just before sending the message. So I'd like to leave
them since it's cleaner.
3)
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by the autovacuum to purge the subscription
+ * errors.
The comments said it's sent by autovacuum, would the manual vacuum also send
this message ?
Right. Fixed.
4)
+
+ pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
+}
Does it look cleaner that we use the offset of m_relid here like the following ?
pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_relid));
Thank you for the suggestion. After more thought, it was a bit odd to
use PgStat_MsgSubscriptionErr both to report and to reset the stats by
sending either part or all of the struct. So in the latest version, I've
added a new message struct type to reset the subscription error
statistics.
I've attached the updated version patches. Please review them.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v14-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch (application/octet-stream)
From 157e87f96113589f17007def474ab70edcdd6da8 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v14 2/3] Add RESET command to ALTER SUBSCRIPTION command.
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be set are streaming, binary,
synchronous_commit.
The RESET parameter for ALTER SUBSCRIPTION is required by the
follow-up commit that introduces a new resettable subscription
parameter "skip_xid".
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 59 +++++++++++++++-------
src/backend/parser/gram.y | 11 +++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 ++++-
src/test/regress/sql/subscription.sql | 13 +++++
6 files changed, 87 insertions(+), 23 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index bec5e9c483..c9700e8699 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -193,16 +194,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..896ec8b836 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,21 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
- supported_opts = (SUBOPT_SLOT_NAME |
- SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ bool is_reset = (stmt->kind == ALTER_SUBSCRIPTION_RESET_OPTIONS);
+
+ if (is_reset)
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+ else
+ supported_opts = (SUBOPT_SLOT_NAME |
+ SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, is_reset);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -926,7 +948,8 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +984,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1008,7 +1031,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = SUBOPT_REFRESH | SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1056,7 +1079,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e3068a374e..70558f964a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9721,7 +9721,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3138877553..539921cb52 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3676,7 +3676,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3688,7 +3689,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 15a1ac6398..e4c16cab66 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -281,11 +281,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7faa935a2a..3b0fbea897 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -215,6 +215,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
v14-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch (application/octet-stream)
From a75a454011cc75ba95a81b314e95882841db903b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v14 3/3] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating pg_subscription.subskipxid field, telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of the
subscription in the pg_stat_subscription_errors system view so as not
to confuse the user. This is done by sending a message to the stats
collector to clear the subscription error.
---
doc/src/sgml/logical-replication.sgml | 56 ++++-
doc/src/sgml/ref/alter_subscription.sgml | 37 +++-
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 42 +++-
src/backend/postmaster/pgstat.c | 17 +-
src/backend/replication/logical/worker.c | 195 +++++++++++++++-
src/backend/utils/adt/pgstatfuncs.c | 2 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/include/pgstat.h | 12 +-
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 +
src/test/subscription/t/024_skip_xact.pl | 244 +++++++++++++++++++++
13 files changed, 625 insertions(+), 19 deletions(-)
create mode 100644 src/test/subscription/t/024_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..f7da60290a 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,68 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the
+ whole transaction that conflicts with the existing data. When a conflict
+ produces an error, it is shown in the
+ <structname>pg_stat_subscription_errors</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]--------+-----------------------------------------------------------
+datname | postgres
+subid | 16395
+subname | test_sub
+relid | 16385
+command | INSERT
+xid | 716
+failure_source | apply
+failure_count | 50
+last_failure | 2021-07-21 21:16:02.781779+00
+last_failure_message | duplicate key value violates unique constraint "test_pkey"
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> on the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ In this case, you need to consider changing the data on the subscriber so that it
+ doesn't conflict with incoming changes, or dropping the conflicting constraint or
+ unique index, or writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or as a last resort, by skipping the whole transaction.
+ They skip the whole transaction, including changes that may not violate any
+ constraint. They may easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index c9700e8699..9098dd1014 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -206,8 +206,41 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are: <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ <literal>streaming</literal>, and the following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraints the logical replication
+ will stop until it is resolved. The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ incoming changes or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose application the logical
+ replication worker is to skip. The logical replication worker skips
+ all data modification changes within the specified transaction.
+ Therefore, since it skips the whole transaction, including changes
+ that may not violate the constraint, it should only be used as a
+ last resort. This option has no effect on a transaction that is
+ already prepared with <literal>two_phase</literal> enabled on the
+ subscriber. After the logical
+ replication successfully skips the transaction, the transaction ID
+ (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ Setting and resetting the <literal>skip_xid</literal> option is
+ restricted to superusers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 896ec8b836..fd74037fb8 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -888,7 +916,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (is_reset)
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
else
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
@@ -941,6 +969,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b01c6b5fcc..d3bc6949e8 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1728,13 +1728,14 @@ pgstat_reset_replslot_counter(const char *name)
* ----------
*/
void
-pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+pgstat_reset_subscription_error(Oid subid, Oid subrelid, bool reset_all)
{
PgStat_MsgSubscriptionErrReset msg;
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRRESET);
msg.m_subid = subid;
msg.m_subrelid = subrelid;
+ msg.m_reset_all = reset_all;
pgstat_send(&msg, sizeof(PgStat_MsgSubscriptionErrReset));
}
@@ -6278,6 +6279,7 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
static void
pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
{
+
PgStat_StatSubErrEntry *errent;
/* Get subscription error */
@@ -6335,8 +6337,17 @@ pgstat_recv_subscription_error_reset(PgStat_MsgSubscriptionErrReset *msg, int le
if (errent == NULL)
return;
- /* reset fields and set reset timestamp */
- pgstat_reset_subscription_error_entry(errent, GetCurrentTimestamp());
+ if (msg->m_reset_all)
+ {
+ /* reset fields and set reset timestamp */
+ pgstat_reset_subscription_error_entry(errent, GetCurrentTimestamp());
+ }
+ else
+ {
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ }
/* If the apply error, reset also the relid */
if (!OidIsValid(msg->m_subrelid))
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index e91fa86b1a..fea0e81ad5 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -255,6 +256,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID if we're skipping all data modification changes
+ * (INSERT/DELETE/UPDATE/TRUNCATE) of the specified transaction in MySubscription->skipxid.
+ * Note that we don't skip receiving the changes, particularly in streaming
+ * cases, since we decide whether or not to skip applying the changes only
+ * when starting to apply them. Once we start skipping changes, we copy the
+ * XID to skipping_xid and don't stop skipping until we have skipped the
+ * whole transaction, even if the
+ * subscription is invalidated and MySubscription->skipxid gets changed or reset.
+ * When stopping the skipping behavior, we reset the skip XID (subskipxid) in the
+ * pg_subscription catalog and associate origin status to the transaction that resets
+ * the skip XID so that we can start streaming from the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -335,6 +351,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -789,6 +808,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -813,7 +837,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop the skipping transaction if enabled. Otherwise, commit the changes
+ * that are just applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -841,6 +876,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -899,9 +937,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -915,6 +954,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1046,6 +1089,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1056,6 +1102,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1081,9 +1131,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1206,6 +1257,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1289,6 +1341,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1428,9 +1484,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2316,6 +2386,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recursively when applying spooled changes, save the current
@@ -3662,3 +3743,103 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by the skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction with xid %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID
+ * (pg_subscription.subskipxid). If origin_lsn and origin_committs are valid,
+ * we set the origin state to the commit of the skipped transaction so that
+ * streaming can restart from the transaction following the one we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction with xid %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ /*
+ * Clear the error statistics of this subscription to let users know that
+ * the subscription is no longer stuck on the conflict.
+ *
+ * The message clearing the error statistics can be lost but that's
+ * okay. The user can confirm that logical replication is working fine
+ * in other ways, for example by checking the pg_stat_subscription view.
+ * The user can also reset the error statistics of a single subscription
+ * with the pg_stat_reset_subscription_error SQL function.
+ */
+ pgstat_reset_subscription_error(MySubscription->oid, InvalidOid, false);
+}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 51f693c22b..63e7a4b632 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2252,7 +2252,7 @@ pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
else
relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
- pgstat_reset_subscription_error(subid, relid);
+ pgstat_reset_subscription_error(subid, relid, true);
PG_RETURN_VOID();
}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 539921cb52..63503b86da 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3694,6 +3694,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index e9702ec150..3cda0c9251 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -584,6 +584,16 @@ typedef struct PgStat_MsgSubscriptionErrReset
*/
Oid m_subid;
Oid m_subrelid;
+
+ /*
+ * If true, the collector resets all fields and sets the stat_reset_timestamp.
+ * Otherwise, it resets all fields except for last_failure, last_errmsg, and
+ * stat_reset_timestamp. This is used by the apply worker, once the error is
+ * resolved, to clear the transaction ID, the command, and the relation OID
+ * that were associated with the error while keeping the details of the last
+ * error.
+ */
+ bool m_reset_all;
} PgStat_MsgSubscriptionErrReset;
/* ----------
@@ -1144,7 +1154,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
-extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid, bool reset_all);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index e4c16cab66..e4dc4fb946 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -293,6 +293,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 3b0fbea897..c458b38985 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -228,6 +228,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/024_skip_xact.pl b/src/test/subscription/t/024_skip_xact.pl
new file mode 100644
index 0000000000..affb663803
--- /dev/null
+++ b/src/test/subscription/t/024_skip_xact.pl
@@ -0,0 +1,244 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 10;
+
+# Test if the error reported in the pg_stat_subscription_errors view is expected.
+sub test_subscription_error
+{
+ my ($node, $source, $relname, $expected_error, $msg) = @_;
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT datname, subname, command, relid::regclass, failure_source, failure_count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass AND failure_source = '$source';
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Check the error reported in the pg_stat_subscription_errors view and skip
+# the failed transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $source, $subname, $relname, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $source, $relname, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber to restart logical replication without waiting
+ # for wal_retrieve_retry_interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+
+ # Also wait for the error details to be cleared.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT command IS NULL FROM pg_stat_subscription_errors
+WHERE subname = '$subname' AND failure_source = '$source';
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# don't overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql('postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will fail
+# repeatedly due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate = 'r'
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data was copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violating
+# the unique constraint on test_tab1. Then skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber,
+ 'tablesync', 'test_tab2',
+ qq(postgres|tap_sub||test_tab2|tablesync|t),
+ 'skip the error reported by the table sync worker');
+
+# Insert enough rows to test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber during applying spooled changes for the same reason. Then
+# skip the transaction in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error reported by the apply worker while applying streamed changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping the transactions that are prepared and stream_prepared. We insert
+# the same data as the previous tests but prepare the transactions. Those insertions
+# raise an error on the subscriber. Then we skip the transactions in question.
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub', 'test_tab1',
+ qq(postgres|tap_sub|INSERT|test_tab1|apply|t),
+ 'skip the error on changes of the prepared transaction');
+
+$node_publisher->safe_psql('postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber,
+ 'apply', 'tap_sub_streaming', 'test_tab_streaming',
+ qq(postgres|tap_sub_streaming|INSERT|test_tab_streaming|apply|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
--
2.24.3 (Apple Git-128)
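For illustration, the intended workflow under this patch series might look like the following (the subscription name and XID are of course hypothetical; column and parameter names are as proposed above):

```sql
-- 1. Identify the failing remote transaction from the proposed statistics view.
SELECT subname, relid::regclass, command, xid, last_failure_message
FROM pg_stat_subscription_errors;

-- 2. Tell the apply worker to skip exactly that remote transaction.
ALTER SUBSCRIPTION tap_sub SET (skip_xid = 590);

-- 3. Once the transaction has been skipped, subskipxid is reset to NULL.
SELECT subname, subskipxid FROM pg_subscription;
```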
From c4add94376eb6455ac9951791cd097439dee083e Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v14 1/3] Add pg_stat_subscription_errors statistics view.
This commit adds a new system view, pg_stat_subscription_errors, which
shows information about any errors that occur while applying logical
replication changes as well as during initial table synchronization.
Subscription error entries are removed by autovacuum workers: after
table synchronization completes, for table sync worker errors, and
after the subscription is dropped, for apply worker errors.
It also adds the SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
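As a quick sketch of the interface this commit introduces (the subscription name below is illustrative), errors can be inspected and reset like so:

```sql
-- Inspect errors reported by apply and table sync workers.
SELECT subname, failure_source, failure_count, last_failure_message
FROM pg_stat_subscription_errors;

-- Reset the apply worker's error statistics for one subscription
-- (pass a relation OID as the second argument for a tablesync error).
SELECT pg_stat_reset_subscription_error(
         (SELECT oid FROM pg_subscription WHERE subname = 'mysub'),
         NULL);
```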
---
doc/src/sgml/monitoring.sgml | 169 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 27 +
src/backend/postmaster/pgstat.c | 709 +++++++++++++++++++++++
src/backend/replication/logical/worker.c | 51 +-
src/backend/utils/adt/pgstatfuncs.c | 112 ++++
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 127 ++++
src/test/regress/expected/rules.out | 22 +
src/tools/pgindent/typedefs.list | 5 +
10 files changed, 1234 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 2281ba120f..b0e426033e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that occurred on a subscription, showing information about
+ each subscription error.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,144 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>datname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the database in which the subscription was created.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error occurred. This
+ field is always NULL if the error is reported by the
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher's transaction that was being applied
+ when the error occurred. This field is always NULL if the error is reported
+ by the <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_source</structfield> <type>text</type>
+ </para>
+ <para>
+ Type of worker reporting the error: <literal>apply</literal> or
+ <literal>tablesync</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>failure_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times the error occurred in the worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failure_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Last reported error message.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which these statistics were last reset.
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5319,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..b0cd8d2546 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,30 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_subscription s ON (e.subid = s.oid)
+ JOIN pg_database d ON (s.subdbid = d.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b7d0fbaefd..b01c6b5fcc 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBSCRIPTION_HASH_SIZE 32
/* ----------
@@ -282,6 +285,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subscriptionHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -332,6 +336,14 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubEntry *pgstat_get_subscription_entry(Oid subid,
+ bool create);
+static PgStat_StatSubErrEntry *pgstat_get_subscription_error_entry(Oid subid,
+ Oid subrelid,
+ bool create);
+static void pgstat_reset_subscription_error_entry(PgStat_StatSubErrEntry *errent,
+ TimestampTz ts);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -373,6 +385,12 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len);
+static void pgstat_recv_subscription_error_reset(PgStat_MsgSubscriptionErrReset *msg,
+ int len);
+static void pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg,
+ int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1178,6 +1196,165 @@ pgstat_vacuum_stat(void)
}
}
+ /*
+ * Search for all the dead subscriptions and error entries in the stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
+ PgStat_MsgSubscriptionPurge submsg;
+ PgStat_StatSubEntry *subent;
+ HTAB *htab;
+
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ submsg.m_nentries = 0;
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_MsgSubscriptionErrPurge errmsg;
+ PgStat_StatSubErrEntry *errent;
+ HASH_SEQ_STATUS hstat_rel;
+ List *not_ready_rels_list;
+ HTAB *not_ready_rels_htab = NULL;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Check if the subscription is dead */
+ if (hash_search(htab, (void *) &(subent->subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add the subid to the message */
+ submsg.m_subids[submsg.m_nentries++] = subent->subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (submsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ submsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * Nothing to do here if the subscription exists but has no table
+ * sync error entries.
+ */
+ if (subent->sync_errors == NULL)
+ continue;
+
+ /*
+ * The subscription has table sync error entries. We look for errors of
+ * table sync workers whose relations are already in ready state; those
+ * errors should be removed.
+ */
+ not_ready_rels_list = GetSubscriptionNotReadyRelations(subent->subid);
+
+ if (not_ready_rels_list != NIL)
+ {
+ HASHCTL hash_ctl;
+ ListCell *lc;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ /*
+ * The number of not-ready relations can be high, for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for faster lookups.
+ */
+ foreach(lc, not_ready_rels_list)
+ {
+ SubscriptionRelState *r_elem = (SubscriptionRelState *) lfirst(lc);
+ SubscriptionRelState *r_entry;
+
+ CHECK_FOR_INTERRUPTS();
+ r_entry = hash_search(not_ready_rels_htab, (void *) &(r_elem->relid),
+ HASH_ENTER, NULL);
+ memcpy(r_entry, r_elem, sizeof(SubscriptionRelState));
+ }
+
+ list_free(not_ready_rels_list);
+ }
+
+ errmsg.m_nentries = 0;
+ errmsg.m_subid = subent->subid;
+
+ /*
+ * Search for all table sync error entries whose relation is already
+ * in ready state.
+ */
+ hash_seq_init(&hstat_rel, subent->sync_errors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&hstat_rel)) != NULL)
+ {
+ Assert(OidIsValid(errent->relid));
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip this table if its synchronization is not completed yet */
+ if (not_ready_rels_htab != NULL &&
+ hash_search(not_ready_rels_htab, (void *) &(errent->relid),
+ HASH_FIND, NULL) != NULL)
+ continue;
+
+ errmsg.m_relids[errmsg.m_nentries++] = errent->relid;
+
+ /*
+ * If the message is full, send it out and reinitialize to
+ * empty
+ */
+ if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead error entries */
+ if (errmsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
+ + errmsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
+ pgstat_send(&errmsg, len);
+ errmsg.m_nentries = 0;
+ }
+
+ /* Clean up */
+ if (not_ready_rels_htab != NULL)
+ hash_destroy(not_ready_rels_htab);
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (submsg.m_nentries > 0)
+ {
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + submsg.m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(&submsg, len);
+ }
+
+ /* Clean up */
+ hash_destroy(htab);
+ }
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1544,6 +1721,24 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subscription_error() -
+ *
+ * Tell the collector to reset the subscription error.
+ * ----------
+ */
+void
+pgstat_reset_subscription_error(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubscriptionErrReset msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRRESET);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+
+ pgstat_send(&msg, sizeof(PgStat_MsgSubscriptionErrReset));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1869,6 +2064,35 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subscription_error() -
+ *
+ * Tell the collector about the subscription error.
+ * ----------
+ */
+void
+pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubscriptionErr msg;
+ int len;
+
+ Assert(strlen(errmsg) < PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+ len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg) + 1;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_failure_time = GetCurrentTimestamp();
+ strlcpy(msg.m_errmsg, errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2987,6 +3211,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/* ----------
+ * pgstat_fetch_subscription_error() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription error struct.
+ * ----------
+ */
+PgStat_StatSubErrEntry *
+pgstat_fetch_subscription_error(Oid subid, Oid relid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subscription_error_entry(subid, relid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3568,6 +3808,24 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONERR:
+ pgstat_recv_subscription_error(&msg.msg_subscriptionerr, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRRESET:
+ pgstat_recv_subscription_error_reset(&msg.msg_subscriptionerrreset,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE:
+ pgstat_recv_subscription_error_purge(&msg.msg_subscriptionerrpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
default:
break;
}
@@ -3868,6 +4126,57 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription error structs
+ */
+ if (subscriptionHash)
+ {
+ PgStat_StatSubEntry *subent;
+
+ hash_seq_init(&hstat, subscriptionHash);
+ while ((subent = (PgStat_StatSubEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ PgStat_StatSubErrEntry *errent;
+ int32 nerrors = (subent->sync_errors == NULL)
+ ? 0
+ : (int32) hash_get_num_entries(subent->sync_errors);
+
+ /*
+ * We always write at least the subscription entry since it could
+ * have an apply worker error.
+ */
+ fputc('S', fpout);
+ rc = fwrite(subent, sizeof(PgStat_StatSubEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* The number of errors follows */
+ rc = fwrite(&nerrors, sizeof(nerrors), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
+ /* Then, the error entries follow */
+ if (nerrors > 0)
+ {
+ HASH_SEQ_STATUS relhstat;
+
+ hash_seq_init(&relhstat, subent->sync_errors);
+ while ((errent = (PgStat_StatSubErrEntry *) hash_seq_search(&relhstat)) != NULL)
+ {
+ /*
+ * XXX we write the whole PgStat_StatSubErrEntry entry, which
+ * contains the fixed-length error message string of
+ * PGSTAT_SUBSCRIPTIONERR_MSGLEN bytes in length, bloating the
+ * stats file. That's okay as long as we assume that the number
+ * of error entries is not high; but if that expectation turns
+ * out to be false, we should write the string and its length
+ * instead.
+ */
+ rc = fwrite(errent, sizeof(PgStat_StatSubErrEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4329,6 +4638,105 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs, describing a
+ * subscription and its errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubEntry *subent;
+ int32 nerrors;
+
+ /* Read the subscription entry */
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin) !=
+ sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription entry and initialize fields */
+ subent =
+ (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &(subbuf.subid),
+ HASH_ENTER, NULL);
+ memcpy(&(subent->apply_error), &(subbuf.apply_error),
+ sizeof(PgStat_StatSubErrEntry));
+ subent->sync_errors = NULL;
+
+ /*
+ * Read the number of table sync errors in the
+ * subscription
+ */
+ if (fread(&nerrors, 1, sizeof(nerrors), fpin) != sizeof(nerrors))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Read table sync error entries */
+ for (int i = 0; i < nerrors; i++)
+ {
+ PgStat_StatSubErrEntry errbuf;
+ PgStat_StatSubErrEntry *errent;
+
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ if (subent->sync_errors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subent->sync_errors = hash_create("table sync errors",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /*
+ * Enter the table sync error information into the
+ * subscription hash.
+ */
+ errent =
+ (PgStat_StatSubErrEntry *) hash_search(subent->sync_errors,
+ (void *) &(errbuf.relid),
+ HASH_ENTER, NULL);
+
+ memcpy(errent, &errbuf, sizeof(PgStat_StatSubErrEntry));
+ }
+
+ break;
+ }
+
case 'E':
goto done;
@@ -4671,6 +5079,50 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubEntry struct followed by the number of
+ * errors and PgStat_StatSubErrEntry structs describing a
+ * subscription and its errors.
+ */
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(nerrors), fpin) != sizeof(nerrors))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+ }
+ }
+
+ break;
+
case 'E':
goto done;
@@ -4876,6 +5328,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subscriptionHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5816,6 +6269,147 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_error() -
+ *
+ * Process a SUBSCRIPTIONERR message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ true);
+
+ /*
+ * If the error is reported by the table sync worker, the OIDs in the
+ * message and in the entry must match. Otherwise, the reporter must be
+ * the apply worker.
+ */
+ Assert(errent);
+ Assert((OidIsValid(msg->m_subrelid) && msg->m_subrelid == msg->m_relid &&
+ msg->m_subrelid == errent->relid) || !OidIsValid(msg->m_subrelid));
+
+ /*
+ * If the error is reported by the apply worker, we always have to update
+ * the relid since the apply worker could report a different relid for
+ * each error. In the table sync error case, relid is already set by the
+ * hash table lookup since it's the hash entry key, so we don't update it.
+ */
+ if (!OidIsValid(msg->m_subrelid))
+ errent->relid = msg->m_relid;
+
+ /* update the error entry */
+ errent->command = msg->m_command;
+ errent->xid = msg->m_xid;
+ errent->failure_count++;
+ errent->last_failure = msg->m_failure_time;
+ strlcpy(errent->last_errmsg, msg->m_errmsg, PGSTAT_SUBSCRIPTIONERR_MSGLEN);
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_reset() -
+ *
+ * Process a SUBSCRIPTIONERRRESET message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_reset(PgStat_MsgSubscriptionErrReset *msg, int len)
+{
+ PgStat_StatSubErrEntry *errent;
+
+ /* Get subscription error */
+ errent = pgstat_get_subscription_error_entry(msg->m_subid,
+ msg->m_subrelid,
+ false);
+
+ /*
+ * Nothing to do if the subscription error entry is not found. This could
+ * happen when the subscription is dropped and the message for dropping
+ * the subscription entry arrived before the message for resetting the error.
+ */
+ if (errent == NULL)
+ return;
+
+ /* reset fields and set reset timestamp */
+ pgstat_reset_subscription_error_entry(errent, GetCurrentTimestamp());
+
+ /* For an apply error, also reset the relid */
+ if (!OidIsValid(msg->m_subrelid))
+ errent->relid = InvalidOid;
+}
+
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and its entry has already
+ * been removed from the hash table.
+ */
+ if (subent == NULL)
+ continue;
+
+ /* Cleanup the table sync errors */
+ if (subent->sync_errors != NULL)
+ hash_destroy(subent->sync_errors);
+
+ /* Remove the subscription entry */
+ (void) hash_search(subscriptionHash, (void *) &(msg->m_subids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
+/* ----------
+ * pgstat_recv_subscription_error_purge() -
+ *
+ * Process a SUBSCRIPTIONERRPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_error_purge(PgStat_MsgSubscriptionErrPurge *msg, int len)
+{
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subid, false);
+
+ /*
+ * Nothing to do if the subscription entry is not found or has no table
+ * sync errors. This could happen when the subscription with
+ * msg->m_subid is removed and the corresponding entry is also removed
+ * before receiving the error purge message.
+ */
+ if (subent == NULL || subent->sync_errors == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ Assert(OidIsValid(msg->m_relids[i]));
+ (void) hash_search(subent->sync_errors, (void *) &(msg->m_relids[i]),
+ HASH_REMOVE, NULL);
+ }
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5934,6 +6528,121 @@ pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotent, TimestampTz ts)
slotent->stat_reset_timestamp = ts;
}
+/* ----------
+ * pgstat_get_subscription_entry
+ *
+ * Return the subscription statistics entry for the given subscription OID.
+ * If no entry exists and the create parameter is true, initialize it;
+ * otherwise return NULL.
+ * ----------
+ */
+static PgStat_StatSubEntry *
+pgstat_get_subscription_entry(Oid subid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ HASHACTION action;
+ bool found;
+
+ if (subscriptionHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubEntry);
+ subscriptionHash = hash_create("Subscription stat entries",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ action = (create ? HASH_ENTER : HASH_FIND);
+ subent = (PgStat_StatSubEntry *) hash_search(subscriptionHash,
+ (void *) &subid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ {
+ MemSet(&(subent->apply_error), 0, sizeof(PgStat_StatSubErrEntry));
+ subent->sync_errors = NULL;
+ }
+
+ return subent;
+}
+
+/* ----------
+ * pgstat_get_subscription_error_entry
+ *
+ * Return the subscription error entry for the given subscription OID and
+ * relation OID. If no entry exists and the create parameter is true,
+ * initialize it; otherwise return NULL.
+ * ----------
+ */
+static PgStat_StatSubErrEntry *
+pgstat_get_subscription_error_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubEntry *subent;
+ PgStat_StatSubErrEntry *errent;
+ HASHACTION action;
+ bool found;
+
+ subent = pgstat_get_subscription_entry(subid, create);
+
+ if (subent == NULL)
+ {
+ Assert(!create);
+ return NULL;
+ }
+
+ if (!OidIsValid(subrelid))
+ {
+ /* Return the apply worker's error entry */
+ return &(subent->apply_error);
+ }
+
+ if (subent->sync_errors == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubErrEntry);
+ subent->sync_errors = hash_create("table sync errors",
+ PGSTAT_SUBSCRIPTION_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ action = (create ? HASH_ENTER : HASH_FIND);
+ errent = (PgStat_StatSubErrEntry *) hash_search(subent->sync_errors,
+ (void *) &subrelid,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ pgstat_reset_subscription_error_entry(errent, 0);
+
+ return errent;
+}
+
+/* Reset fields other than relid and set the reset timestamp */
+static void
+pgstat_reset_subscription_error_entry(PgStat_StatSubErrEntry *errent,
+ TimestampTz ts)
+{
+ errent->command = 0;
+ errent->xid = InvalidTransactionId;
+ errent->failure_count = 0;
+ errent->last_failure = 0;
+ errent->last_errmsg[0] = '\0';
+ errent->stat_reset_timestamp = ts;
+}
+
/*
* pgstat_slru_index
*
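For review convenience, the purge-message handling in pgstat_vacuum_stat() above follows one fill-and-flush pattern for both PGSTAT_MTYPE_SUBSCRIPTIONPURGE and PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE: accumulate dead OIDs in a fixed-capacity message, send whenever it fills up, and send the remainder at the end. A minimal sketch of that pattern (plain Python, names are illustrative and not part of the patch):

```python
# Sketch of the fixed-capacity batching used for the purge messages:
# fill the message, flush when full, flush the remainder afterwards.
MAX_ENTRIES = 4  # stands in for PGSTAT_NUM_SUBSCRIPTIONPURGE

def purge_in_batches(dead_oids, send):
    entries = []
    for oid in dead_oids:
        entries.append(oid)
        if len(entries) >= MAX_ENTRIES:
            send(list(entries))  # pgstat_send() with a full message
            entries.clear()
    if entries:                  # "Send the rest of dead entries"
        send(list(entries))

batches = []
purge_in_batches(range(10), batches.append)
# batches -> [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```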
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 8d96c926b4..e91fa86b1a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3329,6 +3329,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3429,8 +3430,27 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /* report the table sync error */
+ pgstat_report_subscription_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3548,7 +3568,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subscription_error(MySubscription->oid,
+ InvalidOid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
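The two PG_TRY/PG_CATCH blocks added to ApplyWorkerMain() above share one shape: catch the error, copy its message after switching back to a usable memory context, report it to the stats collector, and re-throw so the worker still dies with the original error. The control flow, sketched in Python with made-up names (not the patch's API):

```python
def run_with_error_report(loop, report):
    """Mirror of the PG_TRY/PG_CATCH wrapper: on failure, report the
    error message to the stats facility, then re-raise (PG_RE_THROW)."""
    try:
        loop()
    except Exception as err:
        report(str(err))  # pgstat_report_subscription_error(...)
        raise             # the worker still exits with the original error

reported = []

def failing_loop():
    raise RuntimeError("duplicate key value violates unique constraint")

try:
    run_with_error_report(failing_loop, reported.append)
except RuntimeError:
    pass
# reported -> ['duplicate key value violates unique constraint']
```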
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..51f693c22b 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset subscription error stats */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subscription_error(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,97 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubErrEntry *errent;
+ int i;
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "failure_source",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "failure_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "last_failure",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failure_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid;
+ else
+ relid = PG_GETARG_OID(1);
+
+ /* Get subscription errors */
+ errent = pgstat_fetch_subscription_error(subid, relid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (errent == NULL)
+ PG_RETURN_NULL();
+
+ i = 0;
+ values[i++] = ObjectIdGetDatum(subid);
+
+ if (OidIsValid(errent->relid))
+ values[i++] = ObjectIdGetDatum(errent->relid);
+ else
+ nulls[i++] = true;
+
+ if (errent->command == 0)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(errent->command));
+
+ if (TransactionIdIsValid(errent->xid))
+ values[i++] = TransactionIdGetDatum(errent->xid);
+ else
+ nulls[i++] = true;
+
+ if (OidIsValid(relid))
+ values[i++] = CStringGetTextDatum("tablesync");
+ else
+ values[i++] = CStringGetTextDatum("apply");
+
+ values[i++] = Int64GetDatum(errent->failure_count);
+
+ if (errent->last_failure == 0)
+ nulls[i++] = true;
+ else
+ values[i++] = TimestampTzGetDatum(errent->last_failure);
+
+ values[i++] = CStringGetTextDatum(errent->last_errmsg);
+
+ if (errent->stat_reset_timestamp == 0)
+ nulls[i++] = true;
+ else
+ values[i++] = TimestampTzGetDatum(errent->stat_reset_timestamp);
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
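The fetch path above ultimately goes through pgstat_get_subscription_error_entry(), whose two-level lookup — subscription hash first, then either the single apply_error slot or a lazily-created per-relation hash, with HASH_ENTER vs. HASH_FIND selected by the create flag — can be sketched as follows (plain Python; dicts stand in for the HTABs and None stands in for InvalidOid):

```python
subscriptions = {}  # stands in for subscriptionHash

def new_error_entry():
    return {"failure_count": 0, "last_errmsg": ""}

def get_error_entry(subid, subrelid, create):
    """Two-level lookup: subscription entry first, then either the
    single apply_error slot or the per-relation sync_errors hash."""
    sub = subscriptions.get(subid)
    if sub is None:
        if not create:
            return None  # HASH_FIND miss
        sub = subscriptions[subid] = {"apply_error": new_error_entry(),
                                      "sync_errors": None}
    if subrelid is None:          # InvalidOid => apply worker error
        return sub["apply_error"]
    if sub["sync_errors"] is None:
        if not create:
            return None
        sub["sync_errors"] = {}   # built lazily, as in the patch
    if subrelid not in sub["sync_errors"]:
        if not create:
            return None
        sub["sync_errors"][subrelid] = new_error_entry()
    return sub["sync_errors"][subrelid]
```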
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..ac02061347 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', proisstrict => 'f',
+ proretset => 'f', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,text,xid,text,int8,timestamptz,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,relid,subid,relid,command,xid,failure_source,failure_count,last_failure,last_failure_message,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
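With the patch applied, the new SQL interface could be exercised roughly as follows (a hypothetical psql session; the subscription name is illustrative, and passing NULL as the second argument addresses the apply worker's error while a relation OID would address a table sync error):

```sql
-- Look up the last error of test_sub's apply worker
SELECT command, xid, failure_count, last_failure_message
FROM pg_stat_get_subscription_error(
    (SELECT oid FROM pg_subscription WHERE subname = 'test_sub'), NULL);

-- Reset its error statistics once the problem has been dealt with
SELECT pg_stat_reset_subscription_error(
    (SELECT oid FROM pg_subscription WHERE subname = 'test_sub'), NULL);
```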
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..e9702ec150 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,10 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRRESET,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -536,6 +541,81 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionErr Sent by the apply worker or the table sync worker to
+ * report an error that occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBSCRIPTIONERR_MSGLEN 256
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker; otherwise it is reported by the table sync worker, in
+ * which case m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* Error information */
+ Oid m_relid;
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_failure_time;
+ char m_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+} PgStat_MsgSubscriptionErr;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrReset Sent by the backend to reset the subscription
+ * error fields.
+ * ----------
+ */
+typedef struct PgStat_MsgSubscriptionErrReset
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * As in PgStat_MsgSubscriptionErr, m_subid and m_subrelid are used to
+ * determine the subscription and the reporter of the error: the apply worker
+ * or the table sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+} PgStat_MsgSubscriptionErrReset;
+
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about the dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubscriptionErrPurge Sent by the backend and autovacuum to purge
+ * the subscription errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONERRPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionErrPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBSCRIPTIONERRPURGE];
+} PgStat_MsgSubscriptionErrPurge;
+
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -714,6 +794,10 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionErr msg_subscriptionerr;
+ PgStat_MsgSubscriptionErrReset msg_subscriptionerrreset;
+ PgStat_MsgSubscriptionErrPurge msg_subscriptionerrpurge;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
} PgStat_Msg;
@@ -929,6 +1013,44 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/*
+ * Subscription error statistics kept in the stats collector, representing
+ * an error that occurred while applying logical replication changes or
+ * during initial table synchronization.
+ */
+typedef struct PgStat_StatSubErrEntry
+{
+ Oid relid; /* hash table key */
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter failure_count;
+ TimestampTz last_failure;
+ char last_errmsg[PGSTAT_SUBSCRIPTIONERR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubErrEntry;
+
+/*
+ * Subscription statistics kept in the stats collector.
+ */
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+
+ /*
+ * Statistics of errors that occurred during logical replication. While
+ * having the hash table for table sync errors we have a separate
+ * statistics value for apply error (apply_error), because we can avoid
+ * building a nested hash table for table sync errors in the case where
+ * there is no table sync error, which is the common case in practice.
+ *
+ * Note that the lifetime of error entries of the apply worker and the
+ * table sync worker are also different. Both are removed altogether
+ * after the subscription is dropped but the table sync errors are
+ * removed also after the table synchronization is completed.
+ */
+ PgStat_StatSubErrEntry apply_error;
+ HTAB *sync_errors;
+} PgStat_StatSubEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1022,6 +1144,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subscription_error(Oid subid, Oid subrelid);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
@@ -1038,6 +1161,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subscription_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1136,6 +1262,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubErrEntry *pgstat_fetch_subscription_error(Oid subid, Oid relid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..3719b8a41a 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,28 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT d.datname,
+ sr.subid,
+ s.subname,
+ e.relid,
+ e.command,
+ e.xid,
+ e.failure_source,
+ e.failure_count,
+ e.last_failure,
+ e.last_failure_message,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ ((LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(subid, relid, command, xid, failure_source, failure_count, last_failure, last_failure_message, stats_reset)
+ JOIN pg_subscription s ON ((e.subid = s.oid)))
+ JOIN pg_database d ON ((s.subdbid = d.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 402a6617a9..82dec0851f 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1940,6 +1940,9 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionErr
+PgStat_MsgSubscriptionErrPurge
+PgStat_MsgSubscriptionPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1951,6 +1954,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubEntry
+PgStat_StatSubErrEntry
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
Hi,
On Fri, Sep 3, 2021 at 4:33 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached rebased patches.
Thanks for these patches, Sawada-san!
Sorry for the very late response.
Thank you for the suggestions and the patch!
The first patch in your series, v12-0001, seems useful to me even before committing any of the rest. I would like to integrate the new pg_stat_subscription_errors view it creates into regression tests for other logical replication features under development.
In particular, it can be hard to write TAP tests that need to wait for subscriptions to catch up or fail. With your view committed, a new PostgresNode function to wait for catchup or for failure can be added, and then developers of different projects can all use that.
I like the idea of creating a common function that waits for the
subscription to be ready (i.e., all relations are either in 'r' or 's'
state). There are many places where we wait for all subscription
relations to be ready in existing TAP tests. We would be able to
replace that code with the function. But I'm not sure that it's
useful to have a function that waits for the subscriptions to either
be ready or raise an error. In TAP tests, I think that if we wait for
the subscription to raise an error, we should wait only for the error
but not for the subscription to be ready. Thoughts?
I am attaching a version of such a function, plus some tests of your patch (since it does not appear to have any). Would you mind reviewing these and giving comments or including them in your next patch version?
I've looked at the patch and here are some comments:
+
+-- no errors should be reported
+SELECT * FROM pg_stat_subscription_errors;
+
+
+-- Test that the subscription errors view exists, and has the right columns
+-- If we expected any rows to exist, we would need to filter out unstable
+-- columns. But since there should be no errors, we just select them all.
+select * from pg_stat_subscription_errors;
The patch adds checks of pg_stat_subscription_errors in order to test
if the subscription doesn't have any error. But since the subscription
errors are updated in an asynchronous manner, we cannot say the
subscription is working fine by checking the view only once.
---
The TAP tests newly added by 025_errors.pl have two subscribers raise
a table sync error, which seems very similar to the tests that
024_skip_xact.pl adds. So I'm not sure we need these tests in a
separate TAP test file.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Sep 21, 2021 at 2:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patches. Please review them.
Some comments on the v14-0001 patch:
(1)
Patch comment
The existing patch comment doesn't read well. I suggest the following updates:
BEFORE:
Add pg_stat_subscription_errors statistics view.
This commits adds new system view pg_stat_logical_replication_error,
showing errors happening during applying logical replication changes
as well as during performing initial table synchronization.
The subscription error entries are removed by autovacuum workers when
the table synchronization competed in table sync worker cases and when
dropping the subscription in apply worker cases.
It also adds SQL function pg_stat_reset_subscription_error() to
reset the single subscription error.
AFTER:
Add a subscription errors statistics view "pg_stat_subscription_errors".
This commit adds a new system view pg_stat_logical_replication_errors,
that shows information about any errors which occur during application
of logical replication changes as well as during performing initial table
synchronization.
The subscription error entries are removed by autovacuum workers after
table synchronization completes in table sync worker cases and after
dropping the subscription in apply worker cases.
It also adds an SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
src/backend/postmaster/pgstat.c
(2)
In pgstat_read_db_statsfile_timestamp(), you've added the following
code for case 'S':
+ case 'S':
+ {
+ PgStat_StatSubEntry subbuf;
+ PgStat_StatSubErrEntry errbuf;
+ int32 nerrors;
+
+ if (fread(&subbuf, 1, sizeof(PgStat_StatSubEntry), fpin)
+ != sizeof(PgStat_StatSubEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
+ if (fread(&nerrors, 1, sizeof(nerrors), fpin) != sizeof(nerrors))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ for (int i = 0; i < nerrors; i++)
+ {
+ if (fread(&errbuf, 1, sizeof(PgStat_StatSubErrEntry), fpin) !=
+ sizeof(PgStat_StatSubErrEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+ }
+ }
+
+ break;
+
Why in the 2nd and 3rd instances of calling fread() and detecting a
corrupted statistics file, does it:
goto done;
instead of:
FreeFile(fpin);
return false;
?
(so ends up returning true for these instances)
It looks like a mistake, but if it's intentional then comments need to
be added to explain it.
(3)
In pgstat_get_subscription_error_entry(), there seems to be a bad comment.
Shouldn't:
+ /* Return the apply error worker */
+ return &(subent->apply_error);
be:
+ /* Return the apply worker error */
+ return &(subent->apply_error);
src/tools/pgindent/typedefs.list
(4)
"PgStat_MsgSubscriptionErrReset" is missing from the list.
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Sep 21, 2021 at 2:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patches. Please review them.
A few review comments for the v14-0002 patch:
(1)
I suggest a small update to the patch comment:
BEFORE:
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be set are streaming, binary,
synchronous_commit.
AFTER:
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters to their default value. The parameters that can be reset
are streaming, binary, and synchronous_commit.
(2)
In the documentation, the RESETable parameters should be listed in the
same way and order as for SET:
BEFORE:
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ </para>
AFTER:
+ <para>
+ The parameters that can be reset are
<literal>synchronous_commit</literal>,
+ <literal>binary</literal>, and <literal>streaming</literal>.
+ </para>
Also, I'm thinking it would be beneficial to say the following before this:
RESET is used to set parameters back to their default value.
(3)
I notice that if you try to reset the slot_name, you get the following message:
postgres=# alter subscription sub reset (slot_name);
ERROR: unrecognized subscription parameter: "slot_name"
This is a bit misleading, because "slot_name" actually IS a
subscription parameter, just not resettable.
It would be better in this case if it said something like:
ERROR: not a resettable subscription parameter: "slot_name"
However, it seems that this is also an existing issue with SET (e.g.
for "refresh" or "two_phase"):
postgres=# alter subscription sub set (refresh=true);
ERROR: unrecognized subscription parameter: "refresh"
Regards,
Greg Nancarrow
Fujitsu Australia
On Tuesday, September 21, 2021 12:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patches. Please review them.
Thanks for updating the patch,
here are a few comments on the v14-0001 patch.
1)
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
ISTM we can pass list_length(not_ready_rels_list) as the nelem to hash_create.
2)
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
...
+ HTAB *htab;
+
It seems we already declare a "HTAB *htab;" in function pgstat_vacuum_stat();
can we use the existing htab here?
3)
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRRESET,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,
Can we append these values at the end of the Enum struct which won't affect the
other Enum values.
Best regards,
Hou zj
On Tue, Sep 21, 2021 at 10:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patches. Please review them.
Review comments for v14-0001-Add-pg_stat_subscription_errors-statistics-view
==============================================================
1.
<entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This
+ field is always NULL if the error is reported by the
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
..
..
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is always NULL if the error is reported
+ by the <literal>tablesync</literal> worker.
+ </para></entry>
Shouldn't we display command and transaction id even for table sync
worker if it occurs during sync phase (syncing with apply worker
position)?
2.
+ /*
+ * The number of not-ready relations can be high for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for faster lookups.
+ */
I am not sure this optimization of converting to not-ready relations
list to hash table is worth it. Are we expecting thousands of
relations per subscription? I think that will be a rare case even if
it is there.
3.
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (subent == NULL)
Is the above comment true even during the purge? I can think of this
during normal processing but not during the purge.
4.
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* Error information */
+ Oid m_relid;
Is m_subrelid used only to distinguish the type of worker? I think
it could be InvalidOid during the syncing phase in the table sync
worker.
5.
+/*
+ * Subscription error statistics kept in the stats collector, representing
+ * an error that occurred during application of logical replication or
The part of the message " ... application of logical replication ..."
sounds a little unclear. Shall we instead write: " ... application of
logical message ..."?
6.
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+
+ /*
+ * Statistics of errors that occurred during logical replication. While
+ * having the hash table for table sync errors we have a separate
+ * statistics value for apply error (apply_error), because we can avoid
+ * building a nested hash table for table sync errors in the case where
+ * there is no table sync error, which is the common case in practice.
+ *
The above comment is not clear to me. Why do you need to have a
separate hash table for table sync errors? And what makes it avoid
building nested hash table?
--
With Regards,
Amit Kapila.
On Fri, Sep 24, 2021 at 8:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Sep 21, 2021 at 10:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patches. Please review them.
Review comments for v14-0001-Add-pg_stat_subscription_errors-statistics-view
==============================================================
1.
<entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This
+ field is always NULL if the error is reported by the
+ <literal>tablesync</literal> worker.
+ </para></entry>
+ </row>
..
..
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is always NULL if the error is reported
+ by the <literal>tablesync</literal> worker.
+ </para></entry>

Shouldn't we display command and transaction id even for table sync
worker if it occurs during sync phase (syncing with apply worker
position)
Right. I'll fix it.
2.
+ /*
+ * The number of not-ready relations can be high for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for faster lookups.
+ */

I am not sure this optimization of converting to not-ready relations
list to hash table is worth it. Are we expecting thousands of
relations per subscription? I think that will be a rare case even if
it is there.
Yeah, it seems overkill. I'll use the simple list. If this becomes a
problem, we can add such optimization later.
3.
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found. This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (subent == NULL)

Is the above comment true even during the purge? I can think of this
during normal processing but not during the purge.
Right, the comment is not true during the purge. Since subent could be
NULL if concurrent autovacuum workers do pgstat_vacuum_stat() I'll
change the comment.
4.
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error. m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker. In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* Error information */
+ Oid m_relid;

Is m_subrelid used only to distinguish the type of worker? I think
it could be InvalidOid during the syncing phase in the table sync
worker.
Right. I'll fix it.
5.
+/*
+ * Subscription error statistics kept in the stats collector, representing
+ * an error that occurred during application of logical replication or

The part of the message " ... application of logical replication ..."
sounds a little unclear. Shall we instead write: " ... application of
logical message ..."?
Will fix.
6.
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+
+ /*
+ * Statistics of errors that occurred during logical replication. While
+ * having the hash table for table sync errors we have a separate
+ * statistics value for apply error (apply_error), because we can avoid
+ * building a nested hash table for table sync errors in the case where
+ * there is no table sync error, which is the common case in practice.
+ *

The above comment is not clear to me. Why do you need to have a
separate hash table for table sync errors? And what makes it avoid
building nested hash table?
In the previous patch, a subscription stats entry
(PgStat_StatSubEntry) had one hash table that had error entries of
both apply and table sync. Since a subscription can have one apply
worker and multiple table sync workers it makes sense to me to have
the subscription entry have a hash table for them. The reason why we
have one error entry for an apply error and a hash table for table
sync errors is that the common case is that an apply error
happens whereas no table sync error does. With this optimization,
if the subscription has only an apply error, we can store it in the
apply_error field and avoid building a hash table for sync errors.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Sep 24, 2021 at 5:27 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Sep 21, 2021 at 2:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patches. Please review them.
A few review comments for the v14-0002 patch:
Thank you for the comments!
(1)
I suggest a small update to the patch comment:

BEFORE:
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be set are streaming, binary,
synchronous_commit.

AFTER:
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters to their default value. The parameters that can be reset
are streaming, binary, and synchronous_commit.

(2)
In the documentation, the RESETable parameters should be listed in the
same way and order as for SET:

BEFORE:
+ <para>
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ </para>
AFTER:
+ <para>
+ The parameters that can be reset are
<literal>synchronous_commit</literal>,
+ <literal>binary</literal>, and <literal>streaming</literal>.
+ </para>

Also, I'm thinking it would be beneficial to say the following before this:
RESET is used to set parameters back to their default value.
I agreed with all of the above comments. I'll incorporate them into
the next version patch that I'm going to submit next Monday.
(3)
I notice that if you try to reset the slot_name, you get the following message:
postgres=# alter subscription sub reset (slot_name);
ERROR: unrecognized subscription parameter: "slot_name"

This is a bit misleading, because "slot_name" actually IS a
subscription parameter, just not resettable.
It would be better in this case if it said something like:
ERROR: not a resettable subscription parameter: "slot_name"

However, it seems that this is also an existing issue with SET (e.g.
for "refresh" or "two_phase"):
postgres=# alter subscription sub set (refresh=true);
ERROR: unrecognized subscription parameter: "refresh"
Good point. Maybe we can improve it in a separate patch?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Sep 24, 2021 at 5:53 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tuesday, September 21, 2021 12:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached the updated version patches. Please review them.
Thanks for updating the patch,
here are a few comments on the v14-0001 patch.
Thank you for the comments!
1)
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(SubscriptionRelState);
+ not_ready_rels_htab = hash_create("not ready relations in subscription",
+ 64,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+

ISTM we can pass list_length(not_ready_rels_list) as the nelem to hash_create.
As Amit pointed out, it seems not necessary to build a temporary hash
table for this purpose.
2)
+ /*
+ * Search for all the dead subscriptions and error entries in stats
+ * hashtable and tell the stats collector to drop them.
+ */
+ if (subscriptionHash)
+ {
...
+ HTAB *htab;
+

It seems we already declare a "HTAB *htab;" in function pgstat_vacuum_stat();
can we use the existing htab here?
Right. Will remove it.
3)
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_SUBSCRIPTIONERR,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRRESET,
+ PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
PGSTAT_MTYPE_AUTOVAC_START,

Can we append these values at the end of the Enum struct which won't affect the
other Enum values.
Yes, I'll move them to the end of the Enum struct.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Sep 24, 2021 at 6:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Sep 24, 2021 at 8:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
6.
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+
+ /*
+ * Statistics of errors that occurred during logical replication. While
+ * having the hash table for table sync errors we have a separate
+ * statistics value for apply error (apply_error), because we can avoid
+ * building a nested hash table for table sync errors in the case where
+ * there is no table sync error, which is the common case in practice.
+ *

The above comment is not clear to me. Why do you need to have a
separate hash table for table sync errors? And what makes it avoid
building nested hash table?In the previous patch, a subscription stats entry
(PgStat_StatSubEntry) had one hash table that had error entries of
both apply and table sync. Since a subscription can have one apply
worker and multiple table sync workers it makes sense to me to have
the subscription entry have a hash table for them.
Sure, but each tablesync worker must have a separate relid. Why can't
we have a single hash table for both apply and table sync workers
which are hashed by sub_id + rel_id? For apply worker, the rel_id will
always be zero (InvalidOId) and tablesync workers will have a unique
OID for rel_id, so we should be able to uniquely identify each of
apply and table sync workers.
--
With Regards,
Amit Kapila.
On Sat, Sep 25, 2021 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Sep 24, 2021 at 6:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Sep 24, 2021 at 8:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
6.
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+
+ /*
+ * Statistics of errors that occurred during logical replication. While
+ * having the hash table for table sync errors we have a separate
+ * statistics value for apply error (apply_error), because we can avoid
+ * building a nested hash table for table sync errors in the case where
+ * there is no table sync error, which is the common case in practice.
+ *

The above comment is not clear to me. Why do you need to have a
separate hash table for table sync errors? And what makes it avoid
building nested hash table?In the previous patch, a subscription stats entry
(PgStat_StatSubEntry) had one hash table that had error entries of
both apply and table sync. Since a subscription can have one apply
worker and multiple table sync workers it makes sense to me to have
the subscription entry have a hash table for them.

Sure, but each tablesync worker must have a separate relid. Why can't
we have a single hash table for both apply and table sync workers
which are hashed by sub_id + rel_id? For apply worker, the rel_id will
always be zero (InvalidOId) and tablesync workers will have a unique
OID for rel_id, so we should be able to uniquely identify each of
apply and table sync workers.
What I imagined is to extend the subscription statistics, for
instance, transaction stats[1]. By having a hash table for
subscriptions, we can store those statistics into an entry of the hash
table and we can think of subscription errors as also statistics of
the subscription. So we can have another hash table for errors in an
entry of the subscription hash table. For example, the subscription
entry struct will be something like:
typedef struct PgStat_StatSubEntry
{
Oid subid; /* hash key */
HTAB *errors; /* apply and table sync errors */
/* transaction stats of subscription */
PgStat_Counter xact_commit;
PgStat_Counter xact_commit_bytes;
PgStat_Counter xact_error;
PgStat_Counter xact_error_bytes;
PgStat_Counter xact_abort;
PgStat_Counter xact_abort_bytes;
PgStat_Counter failure_count;
} PgStat_StatSubEntry;
When a subscription is dropped, we can easily drop the subscription
entry along with those statistics including the errors from the hash
table.
Regards,
[1]: /messages/by-id/OSBPR01MB48887CA8F40C8D984A6DC00CED199@OSBPR01MB4888.jpnprd01.prod.outlook.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Sep 27, 2021 at 6:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Sep 25, 2021 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Sure, but each tablesync worker must have a separate relid. Why can't
we have a single hash table for both apply and table sync workers
which are hashed by sub_id + rel_id? For apply worker, the rel_id will
always be zero (InvalidOId) and tablesync workers will have a unique
OID for rel_id, so we should be able to uniquely identify each of
apply and table sync workers.

What I imagined is to extend the subscription statistics, for
instance, transaction stats[1]. By having a hash table for
subscriptions, we can store those statistics into an entry of the hash
table and we can think of subscription errors as also statistics of
the subscription. So we can have another hash table for errors in an
entry of the subscription hash table. For example, the subscription
entry struct will be something like:

typedef struct PgStat_StatSubEntry
{
Oid subid; /* hash key */

HTAB *errors; /* apply and table sync errors */
/* transaction stats of subscription */
PgStat_Counter xact_commit;
PgStat_Counter xact_commit_bytes;
PgStat_Counter xact_error;
PgStat_Counter xact_error_bytes;
PgStat_Counter xact_abort;
PgStat_Counter xact_abort_bytes;
PgStat_Counter failure_count;
} PgStat_StatSubEntry;
I think these additional stats will be displayed via
pg_stat_subscription, right? If so, the current stats of that view are
all in-memory and are per LogicalRepWorker which means that for those
stats also we will have different entries for apply and table sync
worker. If this understanding is correct, won't it be better to
represent this as below?
typedef struct PgStat_StatSubWorkerEntry
{
/* hash key */
Oid subid;
Oid relid;
/* worker stats which includes xact stats */
PgStat_SubWorkerStats worker_stats;
/* error stats */
PgStat_StatSubErrEntry worker_error_stats;
} PgStat_StatSubWorkerEntry;
typedef struct PgStat_SubWorkerStats
{
/* define existing stats here */
....
/* transaction stats of subscription */
PgStat_Counter xact_commit;
PgStat_Counter xact_commit_bytes;
PgStat_Counter xact_error;
PgStat_Counter xact_error_bytes;
PgStat_Counter xact_abort;
PgStat_Counter xact_abort_bytes;
} PgStat_SubWorkerStats;
Now, at drop subscription, we do need to find and remove all the subid
+ relid entries.
--
With Regards,
Amit Kapila.
On Fri, Sep 24, 2021 at 7:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Sep 3, 2021 at 4:33 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
I am attaching a version of such a function, plus some tests of your patch (since it does not appear to have any). Would you mind reviewing these and giving comments or including them in your next patch version?
I've looked at the patch and here are some comments:
+
+-- no errors should be reported
+SELECT * FROM pg_stat_subscription_errors;
+
+
+-- Test that the subscription errors view exists, and has the right columns
+-- If we expected any rows to exist, we would need to filter out unstable
+-- columns. But since there should be no errors, we just select them all.
+select * from pg_stat_subscription_errors;

The patch adds checks of pg_stat_subscription_errors in order to test
if the subscription doesn't have any error. But since the subscription
errors are updated in an asynchronous manner, we cannot say the
subscription is working fine by checking the view only once.
One question I have here is, can we reliably write a few tests just for
the new view patch? Right now, it has no tests; having a few tests would
be better. Here, because the apply worker will keep on failing till we
stop it or resolve the conflict, can we rely on that fact? The idea
is that even if one of the entries is missed by the stats collector, a new
one (probably the same one) will be issued and we can wait till we see
one error in the view. We can add additional PostgresNode.pm
infrastructure once the main patch is committed.
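The retry strategy described above (poll the view until at least one error appears, relying on the apply worker failing and re-reporting until the conflict is resolved) can be sketched in C. Both functions are hypothetical stand-ins: view_has_error() models a query against pg_stat_subscription_errors whose result lags behind the worker's failure because stats are reported asynchronously:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-in for querying pg_stat_subscription_errors: the stats collector
 * records errors asynchronously, so the first few polls may see nothing
 * even though the apply worker has already failed.
 */
static int polls_until_visible = 3;  /* simulated reporting delay */

static bool
view_has_error(void)
{
    if (polls_until_visible > 0)
    {
        polls_until_visible--;
        return false;  /* error not reported yet */
    }
    return true;       /* worker failed again; error now visible */
}

/*
 * Poll until one error is visible in the view, up to max_attempts tries.
 * Because the worker keeps failing and re-reporting until the conflict is
 * resolved, a stats message lost once is eventually replaced by a new one.
 */
static bool
wait_for_subscription_error(int max_attempts)
{
    for (int i = 0; i < max_attempts; i++)
    {
        if (view_has_error())
            return true;
        /* a real test would sleep briefly here before retrying */
    }
    return false;
}
```

A TAP test would implement the same loop in Perl via PostgresNode's poll_query_until.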
--
With Regards,
Amit Kapila.
On Mon, Sep 27, 2021 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Sep 27, 2021 at 6:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Sep 25, 2021 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Sure, but each tablesync worker must have a separate relid. Why can't
we have a single hash table for both apply and table sync workers
which are hashed by sub_id + rel_id? For apply worker, the rel_id will
always be zero (InvalidOid) and tablesync workers will have a unique
OID for rel_id, so we should be able to uniquely identify each of
apply and table sync workers.

What I imagined is to extend the subscription statistics, for
instance, transaction stats[1]. By having a hash table for
subscriptions, we can store those statistics into an entry of the hash
table and we can think of subscription errors as also statistics of
the subscription. So we can have another hash table for errors in an
entry of the subscription hash table. For example, the subscription
entry struct will be something like:

typedef struct PgStat_StatSubEntry
{
Oid subid; /* hash key */
HTAB *errors; /* apply and table sync errors */
/* transaction stats of subscription */
PgStat_Counter xact_commit;
PgStat_Counter xact_commit_bytes;
PgStat_Counter xact_error;
PgStat_Counter xact_error_bytes;
PgStat_Counter xact_abort;
PgStat_Counter xact_abort_bytes;
PgStat_Counter failure_count;
} PgStat_StatSubEntry;

I think these additional stats will be displayed via
pg_stat_subscription, right? If so, the current stats of that view are
all in-memory and are per LogicalRepWorker which means that for those
stats also we will have different entries for apply and table sync
worker. If this understanding is correct, won't it be better to
represent this as below?
I was thinking that we have a different stats view for example
pg_stat_subscription_xacts that has entries per subscription. But your
idea seems better to me.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Sep 27, 2021 at 12:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Sep 27, 2021 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Sep 27, 2021 at 6:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Sep 25, 2021 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Sure, but each tablesync worker must have a separate relid. Why can't
we have a single hash table for both apply and table sync workers
which are hashed by sub_id + rel_id? For apply worker, the rel_id will
always be zero (InvalidOid) and tablesync workers will have a unique
OID for rel_id, so we should be able to uniquely identify each of
apply and table sync workers.

What I imagined is to extend the subscription statistics, for
instance, transaction stats[1]. By having a hash table for
subscriptions, we can store those statistics into an entry of the hash
table and we can think of subscription errors as also statistics of
the subscription. So we can have another hash table for errors in an
entry of the subscription hash table. For example, the subscription
entry struct will be something like:

typedef struct PgStat_StatSubEntry
{
Oid subid; /* hash key */
HTAB *errors; /* apply and table sync errors */
/* transaction stats of subscription */
PgStat_Counter xact_commit;
PgStat_Counter xact_commit_bytes;
PgStat_Counter xact_error;
PgStat_Counter xact_error_bytes;
PgStat_Counter xact_abort;
PgStat_Counter xact_abort_bytes;
PgStat_Counter failure_count;
} PgStat_StatSubEntry;

I think these additional stats will be displayed via
pg_stat_subscription, right? If so, the current stats of that view are
all in-memory and are per LogicalRepWorker which means that for those
stats also we will have different entries for apply and table sync
worker. If this understanding is correct, won't it be better to
represent this as below?

I was thinking that we have a different stats view for example
pg_stat_subscription_xacts that has entries per subscription. But your
idea seems better to me.
I mean that showing statistics (including transaction statistics and
errors) per logical replication worker seems better to me, no matter
what view shows these statistics. I'll change the patch in that way.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Sep 27, 2021 at 12:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Sep 24, 2021 at 7:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Sep 3, 2021 at 4:33 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
I am attaching a version of such a function, plus some tests of your patch (since it does not appear to have any). Would you mind reviewing these and giving comments or including them in your next patch version?
I've looked at the patch and here are some comments:
+
+-- no errors should be reported
+SELECT * FROM pg_stat_subscription_errors;
+
+
+-- Test that the subscription errors view exists, and has the right columns
+-- If we expected any rows to exist, we would need to filter out unstable
+-- columns. But since there should be no errors, we just select them all.
+select * from pg_stat_subscription_errors;

The patch adds checks of pg_stat_subscription_errors in order to test
if the subscription doesn't have any error. But since the subscription
errors are updated in an asynchronous manner, we cannot say the
subscription is working fine by checking the view only once.

One question I have here is, can we reliably write a few tests just for
the new view patch? Right now, it has no tests; having a few tests would
be better. Here, because the apply worker will keep on failing till we
stop it or resolve the conflict, can we rely on that fact? The idea
is that even if one of the entries is missed by the stats collector, a new
one (probably the same one) will be issued and we can wait till we see
one error in the view. We can add additional PostgresNode.pm
infrastructure once the main patch is committed.
Yes, the new tests added by 0003 patch (skip_xid patch) use that fact.
After the error is shown in the view, we fetch the XID from the view
to specify as skip_xid. The tests just for the
pg_stat_subscription_errors view will be a subset of these tests. So
probably we can add it in 0001 patch and 0003 patch can extend the
tests so that it tests skip_xid option.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Sep 27, 2021 at 11:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Sep 27, 2021 at 12:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Sep 24, 2021 at 7:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Sep 3, 2021 at 4:33 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
I am attaching a version of such a function, plus some tests of your patch (since it does not appear to have any). Would you mind reviewing these and giving comments or including them in your next patch version?
I've looked at the patch and here are some comments:
+
+-- no errors should be reported
+SELECT * FROM pg_stat_subscription_errors;
+
+
+-- Test that the subscription errors view exists, and has the right columns
+-- If we expected any rows to exist, we would need to filter out unstable
+-- columns. But since there should be no errors, we just select them all.
+select * from pg_stat_subscription_errors;

The patch adds checks of pg_stat_subscription_errors in order to test
if the subscription doesn't have any error. But since the subscription
errors are updated in an asynchronous manner, we cannot say the
subscription is working fine by checking the view only once.

One question I have here is, can we reliably write a few tests just for
the new view patch? Right now, it has no tests; having a few tests would
be better. Here, because the apply worker will keep on failing till we
stop it or resolve the conflict, can we rely on that fact? The idea
is that even if one of the entries is missed by the stats collector, a new
one (probably the same one) will be issued and we can wait till we see
one error in the view. We can add additional PostgresNode.pm
infrastructure once the main patch is committed.

Yes, the new tests added by 0003 patch (skip_xid patch) use that fact.
After the error is shown in the view, we fetch the XID from the view
to specify as skip_xid. The tests just for the
pg_stat_subscription_errors view will be a subset of these tests. So
probably we can add it in 0001 patch and 0003 patch can extend the
tests so that it tests skip_xid option.
This makes sense to me.
--
With Regards,
Amit Kapila.
On Mon, Sep 27, 2021 at 11:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Sep 27, 2021 at 12:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Sep 27, 2021 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Sep 27, 2021 at 6:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Sep 25, 2021 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Sure, but each tablesync worker must have a separate relid. Why can't
we have a single hash table for both apply and table sync workers
which are hashed by sub_id + rel_id? For apply worker, the rel_id will
always be zero (InvalidOid) and tablesync workers will have a unique
OID for rel_id, so we should be able to uniquely identify each of
apply and table sync workers.

What I imagined is to extend the subscription statistics, for
instance, transaction stats[1]. By having a hash table for
subscriptions, we can store those statistics into an entry of the hash
table and we can think of subscription errors as also statistics of
the subscription. So we can have another hash table for errors in an
entry of the subscription hash table. For example, the subscription
entry struct will be something like:

typedef struct PgStat_StatSubEntry
{
Oid subid; /* hash key */
HTAB *errors; /* apply and table sync errors */
/* transaction stats of subscription */
PgStat_Counter xact_commit;
PgStat_Counter xact_commit_bytes;
PgStat_Counter xact_error;
PgStat_Counter xact_error_bytes;
PgStat_Counter xact_abort;
PgStat_Counter xact_abort_bytes;
PgStat_Counter failure_count;
} PgStat_StatSubEntry;

I think these additional stats will be displayed via
pg_stat_subscription, right? If so, the current stats of that view are
all in-memory and are per LogicalRepWorker which means that for those
stats also we will have different entries for apply and table sync
worker. If this understanding is correct, won't it be better to
represent this as below?

I was thinking that we have a different stats view for example
pg_stat_subscription_xacts that has entries per subscription. But your
idea seems better to me.

I mean that showing statistics (including transaction statistics and
errors) per logical replication worker seems better to me, no matter
what view shows these statistics. I'll change the patch in that way.
Sounds good.
--
With Regards,
Amit Kapila.
On Mon, Sep 27, 2021 at 2:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Sep 27, 2021 at 11:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Sep 27, 2021 at 12:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Sep 27, 2021 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Sep 27, 2021 at 6:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Sep 25, 2021 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Sure, but each tablesync worker must have a separate relid. Why can't
we have a single hash table for both apply and table sync workers
which are hashed by sub_id + rel_id? For apply worker, the rel_id will
always be zero (InvalidOid) and tablesync workers will have a unique
OID for rel_id, so we should be able to uniquely identify each of
apply and table sync workers.

What I imagined is to extend the subscription statistics, for
instance, transaction stats[1]. By having a hash table for
subscriptions, we can store those statistics into an entry of the hash
table and we can think of subscription errors as also statistics of
the subscription. So we can have another hash table for errors in an
entry of the subscription hash table. For example, the subscription
entry struct will be something like:

typedef struct PgStat_StatSubEntry
{
Oid subid; /* hash key */
HTAB *errors; /* apply and table sync errors */
/* transaction stats of subscription */
PgStat_Counter xact_commit;
PgStat_Counter xact_commit_bytes;
PgStat_Counter xact_error;
PgStat_Counter xact_error_bytes;
PgStat_Counter xact_abort;
PgStat_Counter xact_abort_bytes;
PgStat_Counter failure_count;
} PgStat_StatSubEntry;

I think these additional stats will be displayed via
pg_stat_subscription, right? If so, the current stats of that view are
all in-memory and are per LogicalRepWorker which means that for those
stats also we will have different entries for apply and table sync
worker. If this understanding is correct, won't it be better to
represent this as below?

I was thinking that we have a different stats view for example
pg_stat_subscription_xacts that has entries per subscription. But your
idea seems better to me.

I mean that showing statistics (including transaction statistics and
errors) per logical replication worker seems better to me, no matter
what view shows these statistics. I'll change the patch in that way.
I've attached updated patches that incorporate all comments I got so
far. Please review them.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v15-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch (application/octet-stream)
From d83601e0f1234fae27f3e27e020a99ded55ec227 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v15 3/3] Add skip_xid option to ALTER SUBSCRIPTION.
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating pg_subscription.subskipxid field, telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. Also, it clears the error statistics of
the subscription in the pg_stat_subscription_errors system view so
that the user does not get confused. This is done by sending a message
for clearing a subscription error to the stats collector.
---
doc/src/sgml/logical-replication.sgml | 55 +++++-
doc/src/sgml/ref/alter_subscription.sgml | 37 +++-
src/backend/catalog/pg_subscription.c | 10 ++
src/backend/commands/subscriptioncmds.c | 42 ++++-
src/backend/replication/logical/worker.c | 183 +++++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 ++
src/test/subscription/t/025_error_report.pl | 107 +++++++++++-
10 files changed, 443 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..4cfcd9faaf 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,67 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the whole
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in the
+ <structname>pg_stat_subscription_errors</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]----+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+relid | 16384
+command | INSERT
+xid | 716
+count | 50
+error_message | duplicate key value violates unique constraint "test_pkey"
+last_failed_time | 2021-09-29 15:52:45.165754+00
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> on the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Before skipping the whole transaction, consider changing the data on the
+ subscriber so that it doesn't conflict with incoming changes, dropping the
+ conflicting constraint or unique index, or writing a trigger on the subscriber
+ to suppress or redirect conflicting incoming changes. Both methods of skipping
+ skip the whole transaction, including changes that may not violate any
+ constraint, and may easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index c6ea386caa..df634a4fd1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -207,8 +207,41 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>,
+ and the following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraints, logical replication
+ will stop until it is resolved. The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ incoming changes or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose application is to be
+ skipped by the logical replication worker. The worker skips all data
+ modification changes within the specified transaction. Therefore,
+ since it skips the whole transaction, including the changes that may
+ not violate the constraint, it should only be used as a last resort.
+ This option has no effect on transactions that are already prepared
+ with <literal>two_phase</literal> enabled on the subscriber. After
+ the logical replication worker successfully skips the transaction,
+ the transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ Setting and resetting the <literal>skip_xid</literal> option is
+ restricted to superusers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 896ec8b836..fd74037fb8 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -888,7 +916,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (is_reset)
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
else
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
@@ -941,6 +969,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ac3236a573..906313f382 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -255,6 +256,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID while we are skipping all data modification
+ * changes (INSERT/DELETE/UPDATE/TRUNCATE) of the transaction specified by
+ * MySubscription->skipxid. Note that we don't skip receiving the changes,
+ * particularly in streaming cases, since we decide whether or not to skip
+ * applying the changes when starting to apply them. Once we start skipping
+ * changes, we copy the XID to skipping_xid and don't stop skipping until we
+ * have skipped the whole transaction, even if the subscription is
+ * invalidated and MySubscription->skipxid gets changed or reset. When
+ * stopping the skipping behavior, we reset the skip XID (subskipxid) in the
+ * pg_subscription catalog and associate the origin status with the
+ * transaction that resets the skip XID so that we can start streaming from
+ * the transaction next to the one that we just skipped.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -335,6 +351,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -789,6 +808,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -813,7 +837,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop the skipping transaction if enabled. Otherwise, commit the changes
+ * that are just applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -841,6 +876,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -899,9 +937,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -915,6 +954,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop the skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1046,6 +1089,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1056,6 +1102,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop the skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1081,9 +1131,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1206,6 +1257,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1289,6 +1341,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop the skipping transaction if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1428,9 +1484,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2316,6 +2386,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recursively when applying spooled changes, save the current
@@ -3662,3 +3743,91 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by the skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID (pg_subscription.subskipxid).
+ * If origin_lsn and origin_committs are valid, we set the origin state to the
+ * transaction commit that resets the skip XID so that we can restart streaming
+ * from the transaction following the one we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 539921cb52..63503b86da 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3694,6 +3694,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index e4c16cab66..e4dc4fb946 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -293,6 +293,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 3b0fbea897..c458b38985 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -228,6 +228,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/025_error_report.pl b/src/test/subscription/t/025_error_report.pl
index c6fea0d046..dc2e9d6618 100644
--- a/src/test/subscription/t/025_error_report.pl
+++ b/src/test/subscription/t/025_error_report.pl
@@ -1,12 +1,14 @@
# Copyright (c) 2021, PostgreSQL Global Development Group
-# Tests for subscription error reporting.
+# Tests for subscription error reporting and skipping logical
+# replication transactions.
+
use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 5;
+use Test::More tests => 14;
# Test if the error reported on pg_subscription_errors view is expected.
sub test_subscription_error
@@ -32,6 +34,35 @@ WHERE relid = '$relname'::regclass;
]);
is($result, $expected_error, $msg);
}
+# Check the error reported on pg_stat_subscription view and skip the failed
+# transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $subname, $relname, $xid, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $relname, $xid, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ is($skipxid, $xid, "remote xid and skip_xid are equal");
+
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber node so logical replication restarts immediately,
+ # without waiting for the retry interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
# Create publisher node.
my $node_publisher = PostgresNode->new('publisher');
@@ -123,7 +154,7 @@ $result = $node_subscriber->safe_psql('postgres',
is($result, q(1), 'check initial data are copied to subscriber');
# Insert more data to test_tab1, raising an error on the subscriber due to violation
-# of the unique constraint on test_tab1.
+# of the unique constraint on test_tab1. Then skip the transaction in question.
my $xid = $node_publisher->safe_psql(
'postgres',
qq[
@@ -132,15 +163,79 @@ INSERT INTO test_tab1 VALUES (1);
SELECT pg_current_xact_id()::xid;
COMMIT;
]);
-test_subscription_error($node_subscriber, 'test_tab1', $xid,
- qq(tap_sub|INSERT|test_tab1|t),
- 'check the error reported by the apply worker');
+test_skip_subscription_error($node_subscriber, 'tap_sub', 'test_tab1',
+ $xid, qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
# Check the table sync worker's error in the view.
test_subscription_error($node_subscriber, 'test_tab2', '',
qq(tap_sub||test_tab2|t),
'check the error reported by the table sync worker');
+# Insert enough rows to test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber while applying spooled changes, for the same reason. Then
+# skip the transaction in question.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub_streaming', 'test_tab_streaming',
+ $xid, qq(tap_sub_streaming|INSERT|test_tab_streaming|t),
+ 'skip the error reported by the apply worker while applying streamed changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping the transactions that are prepared and stream-prepared. We insert
+# the same data as the previous tests but prepare the transactions. Those insertions
+# raise an error on the subscriber. Then we skip the transactions in question.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub', 'test_tab1',
+ $xid, qq(tap_sub|INSERT|test_tab1|t),
+ 'skip the error on changes of the prepared transaction');
+
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub_streaming', 'test_tab_streaming',
+ $xid, qq(tap_sub_streaming|INSERT|test_tab_streaming|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
# Check if the view doesn't show any entries after dropping the subscriptions.
$node_subscriber->safe_psql(
'postgres',
--
2.24.3 (Apple Git-128)
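To summarize the apply-worker flow added by the worker.c hunks above: the worker starts skipping when the remote XID matches pg_subscription.subskipxid, discards data-modification messages while skipping, and clears both the in-memory state and the catalog field at commit/prepare time so the transaction is skipped exactly once. The following is a toy Python model of that state machine; the class, its defaults, and the message names are illustrative only and are not part of the patch.

```python
INVALID_XID = 0  # models InvalidTransactionId
DML_ACTIONS = {"INSERT", "UPDATE", "DELETE", "TRUNCATE"}

class ApplyWorker:
    """Toy model of the skip_xid logic sketched in worker.c (hypothetical names)."""

    def __init__(self, subskipxid=INVALID_XID):
        self.subskipxid = subskipxid      # models pg_subscription.subskipxid
        self.skipping_xid = INVALID_XID   # models the static skipping_xid
        self.applied = []                 # changes that were actually applied

    def maybe_start_skipping_changes(self, xid):
        # Start skipping only if the remote XID matches the configured one.
        if self.subskipxid != INVALID_XID and self.subskipxid == xid:
            self.skipping_xid = xid

    def apply_dispatch(self, action, payload):
        # Discard all data-modification changes of the skipped transaction.
        if self.skipping_xid != INVALID_XID and action in DML_ACTIONS:
            return
        self.applied.append((action, payload))

    def stop_skipping_changes(self):
        # Reset both the in-memory state and the catalog field, so the
        # transaction is skipped exactly once.
        self.skipping_xid = INVALID_XID
        self.subskipxid = INVALID_XID
```

For example, a worker configured with skip_xid = 590 drops the INSERT of transaction 590 but applies the next transaction's changes normally.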
Attachment: v15-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch (application/octet-stream)
From f68d8b54109c3251457ccb630f13daadb14a7a47 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v15 2/3] Add RESET command to ALTER SUBSCRIPTION command.
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be reset are streaming, binary,
and synchronous_commit.
The RESET command for ALTER SUBSCRIPTION is required by the
follow-up commit that introduces a new resettable subscription
parameter "skip_xid".
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 59 +++++++++++++++-------
src/backend/parser/gram.y | 11 +++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 ++++-
src/test/regress/sql/subscription.sql | 13 +++++
6 files changed, 87 insertions(+), 23 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..c6ea386caa 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -194,16 +195,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are <literal>streaming</literal>,
+ <literal>binary</literal>, and <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..896ec8b836 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,21 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
- supported_opts = (SUBOPT_SLOT_NAME |
- SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ bool is_reset = (stmt->kind == ALTER_SUBSCRIPTION_RESET_OPTIONS);
+
+ if (is_reset)
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+ else
+ supported_opts = (SUBOPT_SLOT_NAME |
+ SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, is_reset);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -926,7 +948,8 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +984,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1008,7 +1031,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = SUBOPT_REFRESH | SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1056,7 +1079,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e3068a374e..70558f964a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9721,7 +9721,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3138877553..539921cb52 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3676,7 +3676,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3688,7 +3689,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 15a1ac6398..e4c16cab66 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -281,11 +281,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7faa935a2a..3b0fbea897 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -215,6 +215,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
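The behavior patch 0002 above gives parse_subscription_options() can be summarized as: RESET accepts only parameter names from the supported set, rejects any parameter given with a value, and keeps the built-in defaults rather than any caller-supplied values. A toy Python model follows; the function name mirrors the C one, but the default values and error type are illustrative assumptions, not the patch's actual defaults.

```python
class SubscriptionOptionError(Exception):
    """Stands in for ereport(ERROR, ...) in the C code."""

def parse_subscription_options(options, supported, is_reset=False):
    # 'options' maps parameter name -> value (None when no value was given,
    # as in ALTER SUBSCRIPTION ... RESET (streaming)).
    defaults = {"streaming": False, "binary": False,
                "synchronous_commit": "off"}  # illustrative defaults only
    result = dict(defaults)
    for name, value in options.items():
        if name not in supported:
            raise SubscriptionOptionError(
                f'unrecognized subscription parameter: "{name}"')
        # RESET must name parameters without values.
        if is_reset and value is not None:
            raise SubscriptionOptionError(
                "RESET must not include values for parameters")
        if not is_reset:
            result[name] = value
    return result
```

With this shape, SET applies the supplied value while RESET (streaming) simply restores the default, matching the regression tests added by the patch.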
Attachment: v15-0001-Add-a-subscription-errors-statistics-view-pg_sta.patch (application/octet-stream)
From 3c3ab0bd589aef7ba165f022c7adfe018fa65cec Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v15 1/3] Add a subscription errors statistics view
"pg_stat_subscription_errors".
This commit adds a new system view pg_stat_subscription_errors,
which shows information about any errors that occur during the
application of logical replication changes as well as during initial
table synchronization.
Subscription error entries are removed by autovacuum workers: for table
sync workers, once table synchronization completes; for apply workers,
once the subscription is dropped.
It also adds an SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
---
doc/src/sgml/monitoring.sgml | 160 +++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 25 +
src/backend/postmaster/pgstat.c | 609 ++++++++++++++++++++
src/backend/replication/logical/worker.c | 51 +-
src/backend/utils/adt/pgstatfuncs.c | 121 ++++
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 121 ++++
src/test/regress/expected/rules.out | 20 +
src/test/subscription/t/025_error_report.pl | 154 +++++
src/tools/pgindent/typedefs.list | 7 +
11 files changed, 1280 insertions(+), 3 deletions(-)
create mode 100644 src/test/subscription/t/025_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 2cd8920645..6c57cd61d5 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -346,6 +346,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that occurred on a subscription, showing information about
+ each subscription error.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; NULL for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This field
+ is always NULL if the error was reported during the initial data
+ copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID, on the publisher node, of the change being applied
+ when the error occurred. This field is always NULL if the error was
+ reported during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ Message of the error
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_failed_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which these statistics were last reset
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the error statistics of a single subscription worker. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets the error statistics of the <literal>tablesync</literal> worker for
+ the relation with OID <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with OID <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..6e891b960e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,28 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_failed_time,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b7d0fbaefd..7a5615c1df 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
/* ----------
@@ -282,6 +285,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subWorkerStatHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -332,6 +336,13 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(Oid subid, Oid subrelid,
+ bool create);
+static void pgstat_reset_subworker_error(PgStat_StatSubWorkerEntry *wentry, TimestampTz ts);
+static void pgstat_report_subworker_purge(PgStat_MsgSubWorkerPurge *msg);
+static void pgstat_report_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg);
+static void pgstat_vacuum_subworker_stats(void);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -356,6 +367,7 @@ static void pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, in
static void pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len);
static void pgstat_recv_resetslrucounter(PgStat_MsgResetslrucounter *msg, int len);
static void pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg, int len);
+static void pgstat_recv_resetsubworkererror(PgStat_MsgResetsubworkererror *msg, int len);
static void pgstat_recv_autovac(PgStat_MsgAutovacStart *msg, int len);
static void pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len);
static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len);
@@ -373,6 +385,10 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
+static void pgstat_recv_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg,
+ int len);
+static void pgstat_recv_subworker_purge(PgStat_MsgSubWorkerPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1178,6 +1194,10 @@ pgstat_vacuum_stat(void)
}
}
+ /* Cleanup the dead subscription workers statistics */
+ if (subWorkerStatHash)
+ pgstat_vacuum_subworker_stats();
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1355,6 +1375,218 @@ pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid)
}
+/* PgStat_StatSubWorkerEntry comparator, sorting by subid and subrelid */
+static int
+subworker_stats_comparator(const ListCell *a, const ListCell *b)
+{
+ PgStat_StatSubWorkerEntry *entry1 = (PgStat_StatSubWorkerEntry *) lfirst(a);
+ PgStat_StatSubWorkerEntry *entry2 = (PgStat_StatSubWorkerEntry *) lfirst(b);
+ int ret;
+
+ ret = oid_cmp(&entry1->key.subid, &entry2->key.subid);
+ if (ret != 0)
+ return ret;
+
+ return oid_cmp(&entry1->key.subrelid, &entry2->key.subrelid);
+}
+
+/* ----------
+ * pgstat_vacuum_subworker_stats() -
+ *
+ * This is a subroutine for pgstat_vacuum_stat to tell the collector about
+ * all the dead subscription worker statistics.
+ */
+static void
+pgstat_vacuum_subworker_stats(void)
+{
+ struct subid_dbid_mapping
+ {
+ Oid subid;
+ Oid dbid;
+ };
+ HTAB *subdbmap;
+ HASHCTL hash_ctl;
+ HASH_SEQ_STATUS hstat;
+ Relation rel;
+ HeapTuple tup;
+ Snapshot snapshot;
+ TupleDesc desc;
+ TableScanDesc scan;
+ PgStat_MsgSubWorkerPurge wpmsg;
+ PgStat_MsgSubWorkerErrorPurge epmsg;
+ PgStat_StatSubWorkerEntry *wentry;
+ List *subworker_stats = NIL;
+ List *not_ready_rels = NIL;
+ ListCell *lc1;
+
+ /* Create a map from subscription OID to database OID */
+ hash_ctl.keysize = sizeof(Oid);
+ hash_ctl.entrysize = sizeof(struct subid_dbid_mapping);
+ subdbmap = hash_create("Temporary map of subscription and database OIDs",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+
+ rel = table_open(SubscriptionRelationId, AccessShareLock);
+ snapshot = RegisterSnapshot(GetLatestSnapshot());
+ scan = table_beginscan(rel, snapshot, 0, NULL);
+ desc = RelationGetDescr(rel);
+
+ /* Register entries into the hash table */
+ while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+ {
+ struct subid_dbid_mapping buf;
+ struct subid_dbid_mapping *entry;
+ bool isnull;
+
+ CHECK_FOR_INTERRUPTS();
+
+ buf.subid = DatumGetObjectId(heap_getattr(tup, Anum_pg_subscription_oid, desc, &isnull));
+ Assert(!isnull);
+
+ buf.dbid = DatumGetObjectId(heap_getattr(tup, Anum_pg_subscription_subdbid, desc, &isnull));
+ Assert(!isnull);
+
+ entry = hash_search(subdbmap, (void *) &(buf.subid), HASH_ENTER, NULL);
+ entry->dbid = buf.dbid;
+ }
+ table_endscan(scan);
+ UnregisterSnapshot(snapshot);
+ table_close(rel, AccessShareLock);
+
+ /* Build the list of worker stats and sort it by subid and subrelid */
+ hash_seq_init(&hstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ subworker_stats = lappend(subworker_stats, wentry);
+ list_sort(subworker_stats, subworker_stats_comparator);
+
+ wpmsg.m_nentries = 0;
+ epmsg.m_nentries = 0;
+ epmsg.m_subid = InvalidOid;
+
+ /*
+ * Search for all the dead subscriptions and unnecessary table sync worker
+ * entries in the stats hashtable and tell the stats collector to drop them.
+ */
+ foreach(lc1, subworker_stats)
+ {
+ struct subid_dbid_mapping *hentry;
+ ListCell *lc2;
+ bool keep_it = false;
+
+ wentry = (PgStat_StatSubWorkerEntry *) lfirst(lc1);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip if we already registered this subscription to purge */
+ if (wpmsg.m_nentries > 0 &&
+ wpmsg.m_subids[wpmsg.m_nentries - 1] == wentry->key.subid)
+ continue;
+
+ /* Check if the subscription is dead */
+ if ((hentry = hash_search(subdbmap, (void *) &(wentry->key.subid),
+ HASH_FIND, NULL)) == NULL)
+ {
+ /* This subscription is dead, add the subid to the message */
+ wpmsg.m_subids[wpmsg.m_nentries++] = wentry->key.subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (wpmsg.m_nentries >= PGSTAT_NUM_SUBWORKERPURGE)
+ {
+ pgstat_report_subworker_purge(&wpmsg);
+ wpmsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * This subscription is live. Next, we search for errors of table
+ * sync workers whose tables are already in sync state; these errors
+ * should be removed.
+ */
+
+ /* We remove only table sync errors in the current database */
+ if (hentry->dbid != MyDatabaseId)
+ continue;
+
+ /* Skip if it's an apply worker error */
+ if (!OidIsValid(wentry->key.subrelid))
+ continue;
+
+ if (epmsg.m_subid != wentry->key.subid)
+ {
+ /*
+ * Send the purge message for previously collected table sync
+ * errors, if any.
+ */
+ if (epmsg.m_nentries > 0)
+ {
+ pgstat_report_subworker_error_purge(&epmsg);
+ epmsg.m_nentries = 0;
+ }
+
+ /* Clean up if necessary */
+ if (not_ready_rels != NIL)
+ list_free_deep(not_ready_rels);
+
+ /* Refresh the not-ready-relations of this subscription */
+ not_ready_rels = GetSubscriptionNotReadyRelations(wentry->key.subid);
+
+ /* Prepare the error purge message for the subscription */
+ epmsg.m_subid = wentry->key.subid;
+ }
+
+ /*
+ * Check if the table is still being synchronized or no longer belongs
+ * to the subscription.
+ */
+ foreach(lc2, not_ready_rels)
+ {
+ SubscriptionRelState *relstate = (SubscriptionRelState *) lfirst(lc2);
+
+ if (relstate->relid == wentry->key.subrelid)
+ {
+ /* This table is still being synchronized, so keep it */
+ keep_it = true;
+ break;
+ }
+ }
+
+ if (keep_it)
+ continue;
+
+ /* Add the table to the error purge message */
+ epmsg.m_relids[epmsg.m_nentries++] = wentry->key.subrelid;
+
+ /*
+ * If the error purge message is full, send it out and reinitialize to
+ * empty
+ */
+ if (epmsg.m_nentries >= PGSTAT_NUM_SUBWORKERERRORPURGE)
+ {
+ pgstat_report_subworker_error_purge(&epmsg);
+ epmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (wpmsg.m_nentries > 0)
+ pgstat_report_subworker_purge(&wpmsg);
+
+ /* Send the rest of dead error entries */
+ if (epmsg.m_nentries > 0)
+ pgstat_report_subworker_error_purge(&epmsg);
+
+ /* Clean up */
+ if (not_ready_rels != NIL)
+ list_free_deep(not_ready_rels);
+
+ hash_destroy(subdbmap);
+}
+
/* ----------
* pgstat_drop_database() -
*
@@ -1544,6 +1776,24 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subworker_error_stats() -
+ *
+ * Tell the collector to reset the subscription worker error statistics.
+ * ----------
+ */
+void
+pgstat_reset_subworker_error_stats(Oid subid, Oid subrelid)
+{
+ PgStat_MsgResetsubworkererror msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETSUBWORKERERROR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+
+ pgstat_send(&msg, sizeof(PgStat_MsgResetsubworkererror));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1804,6 +2054,47 @@ pgstat_should_report_connstat(void)
return MyBackendType == B_BACKEND;
}
+/* --------
+ * pgstat_report_subworker_purge() -
+ *
+ * Tell the collector about dead subscriptions.
+ * --------
+ */
+static void
+pgstat_report_subworker_purge(PgStat_MsgSubWorkerPurge *msg)
+{
+ int len;
+
+ Assert(msg->m_nentries > 0);
+
+ len = offsetof(PgStat_MsgSubWorkerPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBWORKERPURGE);
+ pgstat_send(msg, len);
+}
+
+/* --------
+ * pgstat_report_subworker_error_purge() -
+ *
+ * Tell the collector to remove table sync errors.
+ * --------
+ */
+static void
+pgstat_report_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg)
+{
+ int len;
+
+ Assert(OidIsValid(msg->m_subid));
+ Assert(msg->m_nentries > 0);
+
+ len = offsetof(PgStat_MsgSubWorkerErrorPurge, m_relids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBWORKERERRORPURGE);
+ pgstat_send(msg, len);
+}
+
/* ----------
* pgstat_report_replslot() -
*
@@ -1869,6 +2160,35 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+ len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2987,6 +3307,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subworker() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_subworker(Oid subid, Oid subrelid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subworker_entry(subid, subrelid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3498,6 +3834,11 @@ PgstatCollectorMain(int argc, char *argv[])
len);
break;
+ case PGSTAT_MTYPE_RESETSUBWORKERERROR:
+ pgstat_recv_resetsubworkererror(&msg.msg_resetsubworkererror,
+ len);
+ break;
+
case PGSTAT_MTYPE_AUTOVAC_START:
pgstat_recv_autovac(&msg.msg_autovacuum_start, len);
break;
@@ -3568,6 +3909,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERRORPURGE:
+ pgstat_recv_subworker_error_purge(&msg.msg_subworkererrorpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERPURGE:
+ pgstat_recv_subworker_purge(&msg.msg_subworkerpurge, len);
+ break;
+
default:
break;
}
@@ -3868,6 +4222,22 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription worker stats struct
+ */
+ if (subWorkerStatHash)
+ {
+ PgStat_StatSubWorkerEntry *wentry;
+
+ hash_seq_init(&hstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(wentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4329,6 +4699,48 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ {
+ PgStat_StatSubWorkerEntry wbuf;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Read the subscription entry */
+ if (fread(&wbuf, 1, sizeof(PgStat_StatSubWorkerEntry), fpin)
+ != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subWorkerStatHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subWorkerStatHash = hash_create("Subscription worker stat entries",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription entry and initialize fields */
+ wentry =
+ (PgStat_StatSubWorkerEntry *) hash_search(subWorkerStatHash,
+ (void *) &wbuf.key,
+ HASH_ENTER, NULL);
+ memcpy(wentry, &wbuf, sizeof(PgStat_StatSubWorkerEntry));
+ break;
+ }
+
case 'E':
goto done;
@@ -4541,6 +4953,7 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
PgStat_WalStats myWalStats;
PgStat_SLRUStats mySLRUStats[SLRU_NUM_ELEMENTS];
PgStat_StatReplSlotEntry myReplSlotStats;
+ PgStat_StatSubWorkerEntry mySubWorkerStats;
FILE *fpin;
int32 format_id;
const char *statfile = permanent ? PGSTAT_STAT_PERMANENT_FILENAME : pgstat_stat_filename;
@@ -4671,6 +5084,22 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&mySubWorkerStats, 1, sizeof(mySubWorkerStats), fpin)
+ != sizeof(mySubWorkerStats))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+ break;
+
case 'E':
goto done;
@@ -4876,6 +5305,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subWorkerStatHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5344,6 +5774,33 @@ pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
}
}
+/* ----------
+ * pgstat_recv_resetsubworkererror() -
+ *
+ * Process a RESETSUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_resetsubworkererror(PgStat_MsgResetsubworkererror *msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ Assert(OidIsValid(msg->m_subid));
+
+ /* Get subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, false);
+
+ /*
+ * Nothing to do if the subscription error entry is not found. This could
+ * happen when the subscription is dropped and the message for dropping
+ * subscription entry arrived before the message for resetting the error.
+ */
+ if (wentry == NULL)
+ return;
+
+ /* reset the entry and set reset timestamp */
+ pgstat_reset_subworker_error(wentry, GetCurrentTimestamp());
+}
/* ----------
* pgstat_recv_autovac() -
@@ -5816,6 +6273,93 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Get the subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, true);
+ Assert(wentry);
+
+ /*
+ * Update only the counter and timestamp if we received the same error
+ * again
+ */
+ if (wentry->relid == msg->m_relid &&
+ wentry->command == msg->m_command &&
+ wentry->xid == msg->m_xid &&
+ strcmp(wentry->message, msg->m_message) == 0)
+ {
+ wentry->count++;
+ wentry->timestamp = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ wentry->relid = msg->m_relid;
+ wentry->command = msg->m_command;
+ wentry->xid = msg->m_xid;
+ wentry->count = 1;
+ wentry->timestamp = msg->m_timestamp;
+ strlcpy(wentry->message, msg->m_message, PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
+/* ----------
+ * pgstat_recv_subworker_purge() -
+ *
+ * Process a SUBWORKERPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_purge(PgStat_MsgSubWorkerPurge *msg, int len)
+{
+ if (subWorkerStatHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ HASH_SEQ_STATUS sstat;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Remove all worker statistics of the subscription */
+ hash_seq_init(&sstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ if (wentry->key.subid == msg->m_subids[i])
+ (void) hash_search(subWorkerStatHash, (void *) &(wentry->key),
+ HASH_REMOVE, NULL);
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error_purge() -
+ *
+ * Process a SUBWORKERERRORPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg, int len)
+{
+ PgStat_StatSubWorkerKey key;
+
+ /* Nothing to do if we don't have any worker statistics */
+ if (subWorkerStatHash == NULL)
+ return;
+
+ key.subid = msg->m_subid;
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ Assert(OidIsValid(msg->m_relids[i]));
+
+ key.subrelid = msg->m_relids[i];
+ (void) hash_search(subWorkerStatHash, (void *) &key, HASH_REMOVE, NULL);
+ }
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5934,6 +6478,71 @@ pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotent, TimestampTz ts)
slotent->stat_reset_timestamp = ts;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return the subscription worker statistics entry for the given
+ * subscription OID and relation OID. If subrelid is InvalidOid, return
+ * the entry of the apply worker; otherwise, return the entry of the
+ * table sync worker associated with subrelid. If no entry exists and
+ * the create parameter is true, initialize it; otherwise return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+ PgStat_StatSubWorkerKey key;
+ HASHACTION action;
+ bool found;
+
+ if (subWorkerStatHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ subWorkerStatHash = hash_create("Subscription worker stat entries",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ action = (create ? HASH_ENTER : HASH_FIND);
+ wentry = (PgStat_StatSubWorkerEntry *) hash_search(subWorkerStatHash,
+ (void *) &key,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ pgstat_reset_subworker_error(wentry, 0);
+
+ return wentry;
+}
+
+/* ----------
+ * pgstat_reset_subworker_error
+ *
+ * Reset the given subscription worker error stats.
+ * ----------
+ */
+static void
+pgstat_reset_subworker_error(PgStat_StatSubWorkerEntry *wentry, TimestampTz ts)
+{
+ wentry->relid = InvalidOid;
+ wentry->command = 0;
+ wentry->xid = InvalidTransactionId;
+ wentry->count = 0;
+ wentry->timestamp = 0;
+ wentry->message[0] = '\0';
+ wentry->stat_reset_timestamp = ts;
+}
+
/*
* pgstat_slru_index
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 8d96c926b4..ac3236a573 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3329,6 +3329,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3429,8 +3430,27 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /* report the table sync error */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3548,7 +3568,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..b2e324036c 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset subscription error statistics */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subworker_error_stats(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,106 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_failed_time",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_subworker(subid, subrelid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* relid */
+ if (OidIsValid(wentry->relid))
+ values[i++] = ObjectIdGetDatum(wentry->relid);
+ else
+ nulls[i++] = true;
+
+ /* command */
+ if (wentry->command != 0)
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(wentry->command));
+ else
+ nulls[i++] = true;
+
+ /* xid */
+ if (TransactionIdIsValid(wentry->xid))
+ values[i++] = TransactionIdGetDatum(wentry->xid);
+ else
+ nulls[i++] = true;
+
+ /* count */
+ values[i++] = Int64GetDatum(wentry->count);
+
+ /* error_message */
+ values[i++] = CStringGetTextDatum(wentry->message);
+
+ /* last_failed_time */
+ if (wentry->timestamp != 0)
+ values[i++] = TimestampTzGetDatum(wentry->timestamp);
+ else
+ nulls[i++] = true;
+
+ /* stats_reset */
+ if (wentry->stat_reset_timestamp != 0)
+ values[i++] = TimestampTzGetDatum(wentry->stat_reset_timestamp);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..a901fe9a55 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,relid,command,xid,count,error_message,last_failed_time,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..fdcfea3ec4 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,7 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_RESETSUBWORKERERROR,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -83,6 +85,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBWORKERERROR,
+ PGSTAT_MTYPE_SUBWORKERERRORPURGE,
+ PGSTAT_MTYPE_SUBWORKERPURGE,
} StatMsgType;
/* ----------
@@ -389,6 +394,24 @@ typedef struct PgStat_MsgResetreplslotcounter
bool clearall;
} PgStat_MsgResetreplslotcounter;
+/* ----------
+ * PgStat_MsgResetsubworkererror Sent by the backend to reset the subscription
+ * worker error information.
+ * ----------
+ */
+typedef struct PgStat_MsgResetsubworkererror
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * Same as PgStat_MsgSubWorkerError, m_subid and m_subrelid are used to
+ * determine the subscription and the reporter of the error: the apply
+ * worker or the table sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+} PgStat_MsgResetsubworkererror;
+
/* ----------
* PgStat_MsgAutovacStart Sent by the autovacuum daemon to signal
* that a database is going to be processed
@@ -536,6 +559,67 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
+ * report the error occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of the error. m_subrelid is InvalidOid if the error is reported
+ * by an apply worker; otherwise it is reported by a table sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * Oid of the table that the reporter was actually processing. This can be
+ * InvalidOid if the worker was applying a non-data-modification change
+ * such as STREAM_STOP.
+ */
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
+
+/* ----------
+ * PgStat_MsgSubWorkerPurge Sent by the backend and autovacuum to tell the
+ * collector about the dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBWORKERPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubWorkerPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBWORKERPURGE];
+} PgStat_MsgSubWorkerPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerErrorPurge Sent by the backend and autovacuum to purge
+ * the subscription errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBWORKERERRORPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubWorkerErrorPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBWORKERERRORPURGE];
+} PgStat_MsgSubWorkerErrorPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -697,6 +781,7 @@ typedef union PgStat_Msg
PgStat_MsgResetsinglecounter msg_resetsinglecounter;
PgStat_MsgResetslrucounter msg_resetslrucounter;
PgStat_MsgResetreplslotcounter msg_resetreplslotcounter;
+ PgStat_MsgResetsubworkererror msg_resetsubworkererror;
PgStat_MsgAutovacStart msg_autovacuum_start;
PgStat_MsgVacuum msg_vacuum;
PgStat_MsgAnalyze msg_analyze;
@@ -714,6 +799,9 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubWorkerError msg_subworkererror;
+ PgStat_MsgSubWorkerErrorPurge msg_subworkererrorpurge;
+ PgStat_MsgSubWorkerPurge msg_subworkerpurge;
} PgStat_Msg;
@@ -929,6 +1017,34 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that occurred
+ * during application of logical replication or the initial table synchronization.
+ */
+ Oid relid;
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter count;
+ TimestampTz timestamp;
+ char message[PGSTAT_SUBWORKERERROR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1022,6 +1138,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subworker_error_stats(Oid subid, Oid subrelid);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
@@ -1038,6 +1155,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1136,6 +1256,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_subworker(Oid subid, Oid subrelid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..7ecd4f167a 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,26 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_failed_time,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ (LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(subid, subrelid, relid, command, xid, count, error_message, last_failed_time, stats_reset)
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/025_error_report.pl b/src/test/subscription/t/025_error_report.pl
new file mode 100644
index 0000000000..c6fea0d046
--- /dev/null
+++ b/src/test/subscription/t/025_error_report.pl
@@ -0,0 +1,154 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 5;
+
+# Test if the error reported in the pg_stat_subscription_errors view is expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass];
+ $check_sql .= " AND xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, command, relid::regclass, count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter an
+# infinite error loop due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data into test_tab1, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Check that the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
+
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index cb5b5ec74c..8ff6294267 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1939,7 +1939,11 @@ PgStat_MsgResetreplslotcounter
PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
+PgStat_MsgResetsubworkererror
PgStat_MsgSLRU
+PgStat_MsgSubWorkerError
+PgStat_MsgSubWorkerErrorPurge
+PgStat_MsgSubWorkerPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1951,6 +1955,9 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
+PgStat_SubWorkerError
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On 30.09.21 07:45, Masahiko Sawada wrote:
I've attached updated patches that incorporate all comments I got so
far. Please review them.
I'm uneasy about the way the xids-to-be-skipped are presented as
subscriptions options, similar to settings such as "binary". I see how
that is convenient, but it's not really the same thing, in how you use
it, is it? Even if we share some details internally, I feel that there
should be a separate syntax somehow.
Also, what happens when you forget to reset the xid after it has passed?
Will it get skipped again after wraparound?
On Fri, Oct 1, 2021 at 5:05 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 30.09.21 07:45, Masahiko Sawada wrote:
I've attached updated patches that incorporate all comments I got so
far. Please review them.I'm uneasy about the way the xids-to-be-skipped are presented as
subscriptions options, similar to settings such as "binary". I see how
that is convenient, but it's not really the same thing, in how you use
it, is it? Even if we share some details internally, I feel that there
should be a separate syntax somehow.
Since I was thinking that ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION, the first several
versions of the patch added a separate syntax for this feature, like ALTER
SUBSCRIPTION ... SET SKIP TRANSACTION xxx. But Amit was concerned
about an additional syntax and about consistency with disable_on_error[1],
which is proposed by Mark Dilger[2], so I’ve changed it to a
subscription option. I tried to find a policy on this by checking the
existing syntaxes but could not find one; interestingly, when it
comes to ALTER SUBSCRIPTION syntax, we support both the ENABLE/DISABLE
syntax and the SET (enabled = on/off) option.
Also, what happens when you forget to reset the xid after it has passed?
Will it get skipped again after wraparound?
Yes. Currently it's the user's responsibility. We thoroughly documented
the risks of this feature; it should be used as a last resort,
since it can easily make the subscriber inconsistent, especially if the
user specifies the wrong transaction ID.
Regards,
[1]: /messages/by-id/CAA4eK1LjrU8x+x=bFazVD10pgOVy0PEE8mpz3nQhDG+mmU8ivQ@mail.gmail.com
[2]: /messages/by-id/DB35438F-9356-4841-89A0-412709EBD3AB@enterprisedb.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Oct 1, 2021 at 6:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 1, 2021 at 5:05 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
Also, what happens when you forget to reset the xid after it has passed?
Will it get skipped again after wraparound?
Yes.
Aren't we resetting the skip_xid once we skip that transaction in
stop_skipping_changes()? If so, it shouldn't be possible to skip it
again after the wraparound. Am I missing something?
Now, if the user has wrongly set some XID that we can't skip because it
is already in the past, or something like that, then I think it is the
user's problem, and that's why it can be done only by superusers. I
think we even thought of protecting against that by cross-checking with
the information in the view, but as the view data is lossy, we can't rely
on that. I think users can even set some valid XID that never hits any
error and we will still skip it, which is what can be done today also
by pg_replication_origin_advance(). I am not sure we can do much
about such scenarios except to carefully document them.
--
With Regards,
Amit Kapila.
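For readers following the wraparound concern: PostgreSQL transaction IDs are 32-bit counters compared with modulo-2^32 arithmetic, so without an automatic reset a stale skip-XID could eventually match a new transaction again. A minimal sketch of that comparison rule (simplified from the server's TransactionIdPrecedes(); it omits the special-casing of permanent XIDs):

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

typedef uint32_t TransactionId;

/*
 * Returns true if xid a logically precedes xid b under modulo-2^32
 * ("circular") comparison: the difference is interpreted as a signed
 * 32-bit value, so each xid precedes the ~2 billion xids assigned
 * after it and follows the ~2 billion assigned before it.
 */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
	return (int32_t) (a - b) < 0;
}
```

Under this rule an xid assigned just before wraparound compares as preceding a small post-wraparound xid, which is why a skip-XID that was never cleared would become matchable again roughly every 2^32 transactions.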
On Fri, Oct 1, 2021 at 6:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 1, 2021 at 5:05 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 30.09.21 07:45, Masahiko Sawada wrote:
I've attached updated patches that incorporate all comments I got so
far. Please review them.
I'm uneasy about the way the xids-to-be-skipped are presented as
subscriptions options, similar to settings such as "binary". I see how
that is convenient, but it's not really the same thing, in how you use
it, is it? Even if we share some details internally, I feel that there
should be a separate syntax somehow.
Since I was thinking that ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION, in the first several
version patches it added a separate syntax for this feature like ALTER
SUBSCRIPTION ... SET SKIP TRANSACTION xxx. But Amit was concerned
about an additional syntax and consistency with disable_on_error[1]
which is proposed by Mark Diliger[2], so I’ve changed it to a
subscription option.
Yeah, the basic idea is that this is not the only option we will
support for taking actions on error/conflict. For example, we might
want to disable subscriptions or allow skipping transactions based on
XID, LSN, etc. So, developing separate syntax for each of the options
doesn't seem like a good idea. However, considering Peter's point, how
about something like:
Alter Subscription <sub_name> On Error ( subscription_parameter [=
value] [, ... ] );
OR
Alter Subscription <sub_name> On Conflict ( subscription_parameter [=
value] [, ... ] );
--
With Regards,
Amit Kapila.
On Fri, Oct 1, 2021 at 2:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Oct 1, 2021 at 6:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 1, 2021 at 5:05 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
Also, what happens when you forget to reset the xid after it has passed?
Will it get skipped again after wraparound?
Yes.
Aren't we resetting the skip_xid once we skip that transaction in
stop_skipping_changes()? If so, it shouldn't be possible to skip it
again after the wraparound. Am I missing something?
Oops, I'd misunderstood the question. Yes, Amit is right. Once we skip
the transaction, skip_xid is automatically reset. So users don't need
to reset it manually after skipping the transaction. Sorry for the
confusion.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
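To illustrate the reset-on-skip behavior described above, here is a minimal self-contained sketch (all names here are hypothetical; the actual patch stores skip_xid in the subscription catalog rather than in an in-memory struct): once the matching transaction has been skipped, the stored XID is cleared, so the same value cannot trigger a second skip after wraparound.

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical per-subscription skip state */
typedef struct SubSkipState
{
	TransactionId skip_xid;		/* xid to skip, or InvalidTransactionId */
	bool		is_skipping;	/* currently suppressing changes? */
} SubSkipState;

/* Called when a remote transaction begins: start skipping if it matches */
static void
maybe_start_skipping(SubSkipState *st, TransactionId remote_xid)
{
	if (st->skip_xid != InvalidTransactionId && st->skip_xid == remote_xid)
		st->is_skipping = true;
}

/*
 * Called at the end of the skipped transaction: stop skipping and clear
 * skip_xid so the same xid never matches again (cf. the behavior of
 * stop_skipping_changes() discussed above).
 */
static void
stop_skipping_changes_sketch(SubSkipState *st)
{
	st->is_skipping = false;
	st->skip_xid = InvalidTransactionId;
}
```

With this shape, a second transaction carrying the same xid after wraparound finds skip_xid already invalid and is applied normally.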
On Fri, Oct 1, 2021 at 5:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Oct 1, 2021 at 6:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 1, 2021 at 5:05 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:On 30.09.21 07:45, Masahiko Sawada wrote:
I've attached updated patches that incorporate all comments I got so
far. Please review them.I'm uneasy about the way the xids-to-be-skipped are presented as
subscriptions options, similar to settings such as "binary". I see how
that is convenient, but it's not really the same thing, in how you use
it, is it? Even if we share some details internally, I feel that there
should be a separate syntax somehow.Since I was thinking that ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION, in the first several
version patches it added a separate syntax for this feature like ALTER
SUBSCRIPTION ... SET SKIP TRANSACTION xxx. But Amit was concerned
about an additional syntax and consistency with disable_on_error[1]
which is proposed by Mark Diliger[2], so I’ve changed it to a
subscription option.Yeah, the basic idea is that this is not the only option we will
support for taking actions on error/conflict. For example, we might
want to disable subscriptions or allow skipping transactions based on
XID, LSN, etc.
I guess disabling subscriptions on error/conflict and skipping
particular transactions are somewhat different types of functions.
Disabling subscriptions on error/conflict seems like a setting
parameter of subscriptions; users might want to specify this
option at creation time. Whereas skipping a particular transaction
is a repair function that the user might want to use on the spot in
case of a failure. I’m a bit concerned that combining these functions
into one syntax could confuse users.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Oct 4, 2021 at 6:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 1, 2021 at 5:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Oct 1, 2021 at 6:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 1, 2021 at 5:05 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:On 30.09.21 07:45, Masahiko Sawada wrote:
I've attached updated patches that incorporate all comments I got so
far. Please review them.I'm uneasy about the way the xids-to-be-skipped are presented as
subscriptions options, similar to settings such as "binary". I see how
that is convenient, but it's not really the same thing, in how you use
it, is it? Even if we share some details internally, I feel that there
should be a separate syntax somehow.
Since I was thinking that ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION, in the first several
version patches it added a separate syntax for this feature like ALTER
SUBSCRIPTION ... SET SKIP TRANSACTION xxx. But Amit was concerned
about an additional syntax and consistency with disable_on_error[1]
which is proposed by Mark Diliger[2], so I’ve changed it to a
subscription option.
Yeah, the basic idea is that this is not the only option we will
support for taking actions on error/conflict. For example, we might
want to disable subscriptions or allow skipping transactions based on
XID, LSN, etc.
I guess disabling subscriptions on error/conflict and skipping the
particular transactions are somewhat different types of functions.
Disabling subscriptions on error/conflict seems likes a setting
parameter of subscriptions. The users might want to specify this
option at creation time.
Okay, but they can still specify it by using "On Error" syntax.
Whereas, skipping the particular transaction
is a repair function that the user might want to use on the spot in
case of a failure. I’m concerned a bit that combining these functions
to one syntax could confuse the users.
Fair enough, I was mainly trying to combine the syntax for all actions
that we can take "On Error". We can allow setting them either at Create
Subscription or Alter Subscription time.
I think the main point here is whether this addresses Peter's
concern about this patch using a separate syntax. Peter E., can you
please confirm? Do let us know if you have something else in
mind.
--
With Regards,
Amit Kapila.
On Mon, Oct 4, 2021 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think here the main point is that does this addresses Peter's
concern for this Patch to use a separate syntax? Peter E., can you
please confirm? Do let us know if you have something else going in
your mind?
Peter's concern seemed to be that the use of a subscription option,
though convenient, isn't an intuitive natural fit for providing this
feature (i.e. ability to skip a transaction by xid). I tend to have
that feeling about using a subscription option for this feature. I'm
not sure what possible alternative syntax he had in mind and currently
can't really think of a good one myself that fits the purpose.
I think that the 1st and 2nd patches are useful in their own right, but
couldn't this feature (i.e. the 3rd patch) be provided instead as an
additional Replication Management function (see 9.27.6)?
e.g. pg_replication_skip_xid
Regards,
Greg Nancarrow
Fujitsu Australia
On Thursday, September 30, 2021 2:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so far. Please
review them.
Hi
Sorry if I misunderstand something, but
did someone check what happens when we
execute ALTER SUBSCRIPTION ... RESET (streaming)
in the middle of one txn that streams its data to the sub several times,
especially after some part of the txn has already been streamed?
My intention is that *if* we can find an actual harm in this,
I want to suggest adding a safeguard or some other measure to the patch.
An example)
Set the logical_decoding_work_mem = 64kB on the pub.
and create a table and subscription with streaming = true.
In addition, log_min_messages = DEBUG1 on the sub
is helpful to check the LOG on the sub in stream_open_file().
<Session 1> connect to the publisher
BEGIN;
INSERT INTO tab VALUES (generate_series(1, 1000)); -- this exceeds the memory limit
SELECT * FROM pg_stat_replication_slots; -- check the actual streaming bytes&counts just in case
<Session 2> connect to the subscriber
-- after checking some logs of "open file .... for streamed changes" on the sub
ALTER SUBSCRIPTION mysub RESET (streaming)
<Session 1>
INSERT INTO tab VALUES (generate_series(1001, 2000)); -- again, exceeds the limit
COMMIT;
I observed that the subscriber doesn't
receive STREAM_COMMIT in this case but gets BEGIN & COMMIT instead at the end.
I couldn't find any apparent and immediate issue from those steps,
but is that really not a problem?
Probably this kind of situation applies to other reset target options as well?
Best Regards,
Takamichi Osumi
On Thu, Sep 30, 2021 at 3:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so
far. Please review them.
Some comments about the v15-0001 patch:
(1) patch adds a whitespace error
Applying: Add a subscription errors statistics view
"pg_stat_subscription_errors".
.git/rebase-apply/patch:1656: new blank line at EOF.
+
warning: 1 line adds whitespace errors.
(2) Patch comment says "This commit adds a new system view
pg_stat_logical_replication_errors ..."
BUT this is the wrong name, it should be "pg_stat_subscription_errors".
doc/src/sgml/monitoring.sgml
(3)
"Message of the error" doesn't sound right. I suggest just saying "The
error message".
(4) view column "last_failed_time"
I think it would be better to name this "last_error_time".
src/backend/postmaster/pgstat.c
(5) pgstat_vacuum_subworker_stats()
Spelling mistake in the following comment:
/* Create a map for mapping subscriptoin OID and database OID */
subscriptoin -> subscription
(6)
In the following functions:
pgstat_read_statsfiles
pgstat_read_db_statsfile_timestamp
The following comment should say "... struct describing subscription
worker statistics."
(i.e. need to remove the "a")
+ * 'S' A PgStat_StatSubWorkerEntry struct describing a
+ * subscription worker statistics.
(7) pgstat_get_subworker_entry
Suggest comment change:
BEFORE:
+ * Return the entry of subscription worker entry with the subscription
AFTER:
+ * Return subscription worker entry with the given subscription
(8) pgstat_recv_subworker_error
+ /*
+ * Update only the counter and timestamp if we received the same error
+ * again
+ */
+ if (wentry->relid == msg->m_relid &&
+ wentry->command == msg->m_command &&
+ wentry->xid == msg->m_xid &&
+ strncmp(wentry->message, msg->m_message, strlen(wentry->message)) == 0)
+ {
Is there a reason that the above check uses strncmp() with
strlen(wentry->message), instead of just strcmp()?
msg->m_message is treated as the same error message if it is the same
up to strlen(wentry->message)?
Perhaps if that is intentional, then the comment should be updated.
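To make the point in (8) concrete: bounding strncmp() by strlen() of the stored message yields a prefix match, so an incoming message that merely starts with the stored one compares equal, unlike strcmp(). A small standalone demonstration (the helper names are ours, not the patch's):

```c
#include <string.h>
#include <stdbool.h>
#include <assert.h>

/*
 * Mirrors the patch's comparison: the incoming message "matches" if it
 * merely begins with the stored message.
 */
static bool
prefix_equal(const char *stored, const char *incoming)
{
	return strncmp(stored, incoming, strlen(stored)) == 0;
}

/* Matches only when both strings are exactly equal */
static bool
exact_equal(const char *stored, const char *incoming)
{
	return strcmp(stored, incoming) == 0;
}
```

So a stored "duplicate key value" entry would absorb a later, longer "duplicate key value violates unique constraint ..." message under the strncmp() form, while strcmp() would treat it as a new error.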
src/tools/pgindent/typedefs.list
(9)
The added "PgStat_SubWorkerError" should be removed from the
typedefs.list (as there is no such new typedef).
Regards,
Greg Nancarrow
Fujitsu Australia
On Thursday, September 30, 2021 2:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so far. Please
review them.
Hello
Minor two comments for v15-0001 patch.
(1) a typo in pgstat_vacuum_subworker_stat()
+ /*
+ * This subscription is live. The next step is that we search errors
+ * of the table sync workers who are already in sync state. These
+ * errors should be removed.
+ */
This subscription is "alive" ?
(2) Suggestion to add one comment next to '0' in ApplyWorkerMain()
+ /* report the table sync error */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0,
+ InvalidTransactionId,
+ errdata->message);
How about writing /* no corresponding message type for table synchronization */ or something ?
Best Regards,
Takamichi Osumi
On 04.10.21 02:31, Masahiko Sawada wrote:
I guess disabling subscriptions on error/conflict and skipping the
particular transactions are somewhat different types of functions.
Disabling subscriptions on error/conflict seems likes a setting
parameter of subscriptions. The users might want to specify this
option at creation time. Whereas, skipping the particular transaction
is a repair function that the user might want to use on the spot in
case of a failure. I’m concerned a bit that combining these functions
to one syntax could confuse the users.
Also, would the skip option be dumped and restored by pg_dump? Maybe
there is an argument for yes, but if not, then we probably need a
different way of handling it, separate from the more permanent options.
On Fri, Oct 8, 2021 at 4:09 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Thursday, September 30, 2021 2:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so far. Please
review them.
Hi
Sorry, if I misunderstand something but
did someone check what happens when we
execute ALTER SUBSCRIPTION ... RESET (streaming)
in the middle of one txn which has several streaming of data to the sub,
especially after some part of txn has been already streamed.
My intention of this is something like *if* we can find an actual harm of this,
I wanted to suggest the necessity of a safeguard or some measure into the patch.
An example)
Set the logical_decoding_work_mem = 64kB on the pub.
and create a table and subscription with streaming = true.
In addition, log_min_messages = DEBUG1 on the sub
is helpful to check the LOG on the sub in stream_open_file().
<Session 1> connect to the publisher
BEGIN;
INSERT INTO tab VALUES (generate_series(1, 1000)); -- this exceeds the memory limit
SELECT * FROM pg_stat_replication_slots; -- check the actual streaming bytes&counts just in case
<Session 2> connect to the subscriber
-- after checking some logs of "open file .... for streamed changes" on the sub
ALTER SUBSCRIPTION mysub RESET (streaming)
<Session 1>
INSERT INTO tab VALUES (generate_series(1001, 2000)); -- again, exceeds the limit
COMMIT;
I observed that the subscriber doesn't
accept STREAM_COMMIT in this case but gets BEGIN&COMMIT instead at the end.
I couldn't find any apparent and immediate issue from those steps
but is that no problem ?
Probably, this kind of situation applies to other reset target options ?
I think that if a subscription parameter such as ‘streaming’ or
‘binary’ is changed, an apply worker exits and the launcher starts a
new worker (see maybe_reread_subscription()). So I guess that in this
case, the apply worker exited during receiving streamed changes,
restarted, and received the same changes with ‘streaming = off’,
therefore it got BEGIN and COMMIT instead. I think that this happens
even by using ‘SET (‘streaming’ = off)’.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Monday, October 11, 2021 11:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 8, 2021 at 4:09 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:On Thursday, September 30, 2021 2:45 PM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so
far. Please review them.

Sorry if I misunderstand something, but did someone check what happens
when we execute ALTER SUBSCRIPTION ... RESET (streaming) in the middle
of one txn which has several streaming of data to the sub, especially
after some part of txn has been already streamed.
My intention of this is something like *if* we can find an actual harm
of this, I wanted to suggest the necessity of a safeguard or some measure into the patch.
...
I observed that the subscriber doesn't accept STREAM_COMMIT in this
case but gets BEGIN&COMMIT instead at the end.
I couldn't find any apparent or immediate issue from those steps, but
is that really no problem?
Probably, this kind of situation applies to other RESET target options, too?

I think that if a subscription parameter such as 'streaming' or
'binary' is changed, the apply worker exits and the launcher starts a new worker
(see maybe_reread_subscription()). So I guess that in this case the apply worker
exited while receiving streamed changes, restarted, and received the same
changes with 'streaming = off', and therefore got BEGIN and COMMIT instead. I
think this happens even when using SET (streaming = off).
You are right. Yes, I checked that the apply worker did exit
and the new apply worker process dealt with the INSERT in the above case.
Also, setting streaming = false behaved the same way.
Thanks a lot for your explanation.
Best Regards,
Takamichi Osumi
On Sun, Oct 10, 2021 at 11:04 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 04.10.21 02:31, Masahiko Sawada wrote:
I guess disabling subscriptions on error/conflict and skipping the
particular transactions are somewhat different types of functions.
Disabling subscriptions on error/conflict seems likes a setting
parameter of subscriptions. The users might want to specify this
option at creation time. Whereas, skipping the particular transaction
is a repair function that the user might want to use on the spot in
case of a failure. I'm a bit concerned that combining these functions
into one syntax could confuse the users.

Also, would the skip option be dumped and restored using pg_dump? Maybe
there is an argument for yes, but if not, then we probably need a
different path of handling it separate from the more permanent options.
Good point. I don’t think the skip option should be dumped and
restored using pg_dump, since transaction IDs are assigned differently
in another installation.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Oct 8, 2021 at 8:17 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Sep 30, 2021 at 3:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so
far. Please review them.

Some comments about the v15-0001 patch:
Thank you for the comments!
(1) patch adds a whitespace error
Applying: Add a subscription errors statistics view
"pg_stat_subscription_errors".
.git/rebase-apply/patch:1656: new blank line at EOF.
+
warning: 1 line adds whitespace errors.
Fixed.
(2) Patch comment says "This commit adds a new system view
pg_stat_logical_replication_errors ..."
BUT this is the wrong name, it should be "pg_stat_subscription_errors".
Fixed.
doc/src/sgml/monitoring.sgml
(3)
"Message of the error" doesn't sound right. I suggest just saying "The
error message".
Fixed.
(4) view column "last_failed_time"
I think it would be better to name this "last_error_time".
Okay, fixed.
src/backend/postmaster/pgstat.c
(5) pgstat_vacuum_subworker_stats()
Spelling mistake in the following comment:
/* Create a map for mapping subscriptoin OID and database OID */
subscriptoin -> subscription
Fixed.
(6)
In the following functions:

pgstat_read_statsfiles
pgstat_read_db_statsfile_timestamp

The following comment should say "... struct describing subscription
worker statistics."
(i.e. need to remove the "a")

+ * 'S'  A PgStat_StatSubWorkerEntry struct describing a
+ *      subscription worker statistics.
Fixed.
(7) pgstat_get_subworker_entry
Suggest comment change:
BEFORE: + * Return the entry of subscription worker entry with the subscription
AFTER:  + * Return subscription worker entry with the given subscription
Fixed.
(8) pgstat_recv_subworker_error
+	/*
+	 * Update only the counter and timestamp if we received the same error
+	 * again
+	 */
+	if (wentry->relid == msg->m_relid &&
+		wentry->command == msg->m_command &&
+		wentry->xid == msg->m_xid &&
+		strncmp(wentry->message, msg->m_message, strlen(wentry->message)) == 0)
+	{

Is there a reason that the above check uses strncmp() with
strlen(wentry->message), instead of just strcmp()?
msg->m_message is treated as the same error message if it is the same
up to strlen(wentry->message)?
Perhaps if that is intentional, then the comment should be updated.
It's better to use strcmp() in this case. Fixed.
src/tools/pgindent/typedefs.list
(9)
The added "PgStat_SubWorkerError" should be removed from the
typedefs.list (as there is no such new typedef).
Fixed.
I've attached updated patches.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
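For readers following the thread, the intended user-facing workflow with the attached patches can be sketched as follows (syntax as proposed in the v16 patches; the subscription name and the XID value 716 are illustrative, taken from the documentation example):

```sql
-- The apply worker hit an error; find the failed remote transaction's XID.
SELECT subname, relid::regclass, command, xid, error_message
FROM pg_stat_subscription_errors;

-- Tell the apply worker to skip that whole remote transaction
-- (superuser only; skips ALL data modifications of XID 716).
ALTER SUBSCRIPTION test_sub SET (skip_xid = 716);

-- Once the transaction has been skipped, subskipxid is cleared again.
SELECT subname, subskipxid FROM pg_subscription;

-- The setting can also be cleared manually before it takes effect.
ALTER SUBSCRIPTION test_sub RESET (skip_xid);
```

Note that the error statistics for the subscription are also cleared after a successful skip, per the 0003 patch's commit message.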
Attachments:
v16-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch (application/octet-stream)
From ae161a52ed34e515e208c53b362c907827529a23 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v16 3/3] Add skip_xid option to ALTER SUBSCRIPTION.
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating pg_subscription.subskipxid field, telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. Also, it clears the error statistics of
the subscription in the pg_stat_subscription_errors system view so
that the user does not get confused. This is done by sending a message
for clearing a subscription error to the stats collector.
---
doc/src/sgml/logical-replication.sgml | 55 +++++-
doc/src/sgml/ref/alter_subscription.sgml | 37 +++-
src/backend/catalog/pg_subscription.c | 10 ++
src/backend/commands/subscriptioncmds.c | 42 ++++-
src/backend/replication/logical/worker.c | 183 +++++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 ++
src/test/subscription/t/025_error_report.pl | 107 +++++++++++-
10 files changed, 443 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..4cfcd9faaf 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,67 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the whole
+ transaction. The logical replication worker skips the application of all
+ data modification changes of the transaction whose ID is specified by the
+ <literal>skip_xid</literal> subscription option. When a conflict produces
+ an error, it is shown in the
+ <structname>pg_stat_subscription_errors</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]----+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+relid | 16384
+command | INSERT
+xid | 716
+count | 50
+error_message | duplicate key value violates unique constraint "test_pkey"
+last_failed_time | 2021-09-29 15:52:45.165754+00
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The ID of the transaction containing the change that violates the constraint
+ can be found in those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> on the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ In this case, first consider changing the data on the subscriber so that it
+ doesn't conflict with incoming changes, or dropping the conflicting constraint
+ or unique index, or writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, and only as a last resort skip the whole
+ transaction. Both skipping methods skip the whole transaction, including changes
+ that may not violate any constraint, so they can easily make the subscriber
+ inconsistent, especially if the wrong transaction ID or origin position is given.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index c6ea386caa..df634a4fd1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -207,8 +207,41 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<literal>streaming</literal>.
</para>
<para>
- The parameters that can be reset are: <literal>streaming</literal>,
- <literal>binary</literal>, <literal>synchronous_commit</literal>.
+ The parameters that can be reset are: <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ <literal>streaming</literal>, and the following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ will stop until the problem is resolved. The resolution can be done
+ either by changing data on the subscriber so that it doesn't conflict
+ with the incoming change or by skipping the whole transaction. This
+ option specifies the ID of the transaction that the logical replication
+ worker skips applying. The worker skips all data modification
+ changes within the specified transaction. Therefore, since it skips
+ the whole transaction, including changes that may not violate the
+ constraint, it should only be used as a last resort. This option has
+ no effect on a transaction that is already prepared with
+ <literal>two_phase</literal> enabled on the subscriber. After the
+ logical replication worker successfully skips the transaction, the
+ transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ Setting and resetting the <literal>skip_xid</literal> option is
+ restricted to superusers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 896ec8b836..fd74037fb8 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -888,7 +916,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (is_reset)
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
else
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
@@ -941,6 +969,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 981c195a87..d301b4fdbf 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -255,6 +256,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID while we are skipping all data modification
+ * changes (INSERT/UPDATE/DELETE/TRUNCATE) of the transaction specified by
+ * MySubscription->skipxid. Note that we don't skip receiving the changes,
+ * particularly in streaming cases, since we decide whether to skip applying
+ * them only when starting to apply. Once we start skipping, we copy the XID
+ * to skipping_xid and don't stop until the whole transaction is skipped,
+ * even if the subscription is invalidated and MySubscription->skipxid is
+ * changed or reset. When we stop skipping, we reset the skip XID (subskipxid)
+ * in the pg_subscription catalog and associate the origin status with the
+ * transaction that resets it, so we can restart streaming from the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -335,6 +351,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -789,6 +808,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -813,7 +837,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop the skipping transaction if enabled. Otherwise, commit the changes
+ * that are just applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -841,6 +876,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -899,9 +937,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -915,6 +954,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop the skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1046,6 +1089,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1056,6 +1102,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop the skipping changes if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1081,9 +1131,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1206,6 +1257,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1289,6 +1341,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop the skipping transaction if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1428,9 +1484,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2316,6 +2386,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3665,3 +3746,91 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID (pg_subscription.subskipxid).
+ * If origin_lsn and origin_committs are valid, we set the origin state to
+ * the transaction commit that resets the skip XID so that we can restart
+ * streaming from the transaction next to the one that we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 539921cb52..63503b86da 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3694,6 +3694,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index e4c16cab66..e4dc4fb946 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -293,6 +293,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 3b0fbea897..c458b38985 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -228,6 +228,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/025_error_report.pl b/src/test/subscription/t/025_error_report.pl
index c5af95f339..94556e2007 100644
--- a/src/test/subscription/t/025_error_report.pl
+++ b/src/test/subscription/t/025_error_report.pl
@@ -1,12 +1,14 @@
# Copyright (c) 2021, PostgreSQL Global Development Group
-# Tests for subscription error reporting.
+# Tests for subscription error reporting and skipping logical
+# replication transactions.
+
use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 5;
+use Test::More tests => 14;
# Test if the error reported on pg_subscription_errors view is expected.
sub test_subscription_error
@@ -32,6 +34,35 @@ WHERE relid = '$relname'::regclass;
]);
is($result, $expected_error, $msg);
}
+# Check the error reported on pg_stat_subscription view and skip the failed
+# transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $subname, $relname, $xid, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $relname, $xid, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ is($skipxid, $xid, "remote xid and skip_xid are equal");
+
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber node to restart logical replication with no interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
# Create publisher node.
my $node_publisher = PostgresNode->new('publisher');
@@ -123,7 +154,7 @@ $result = $node_subscriber->safe_psql('postgres',
is($result, q(1), 'check initial data are copied to subscriber');
# Insert more data to test_tab1, raising an error on the subscriber due to violation
-# of the unique constraint on test_tab1.
+# of the unique constraint on test_tab1. Then skip the transaction in question.
my $xid = $node_publisher->safe_psql(
'postgres',
qq[
@@ -132,15 +163,79 @@ INSERT INTO test_tab1 VALUES (1);
SELECT pg_current_xact_id()::xid;
COMMIT;
]);
-test_subscription_error($node_subscriber, 'test_tab1', $xid,
- qq(tap_sub|INSERT|test_tab1|t),
- 'check the error reported by the apply worker');
+test_skip_subscription_error($node_subscriber, 'tap_sub', 'test_tab1',
+ $xid, qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
# Check the table sync worker's error in the view.
test_subscription_error($node_subscriber, 'test_tab2', '',
qq(tap_sub||test_tab2|t),
'check the error reported by the table sync worker');
+# Insert enough rows into test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber while applying the spooled changes, for the same reason. Then
+# skip the transaction in question.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub_streaming', 'test_tab_streaming',
+ $xid, qq(tap_sub_streaming|INSERT|test_tab_streaming|t),
+ 'skip the error reported by the apply worker while applying streaming changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping transactions that are prepared and stream-prepared. We insert
+# the same data as in the previous tests but prepare the transactions. Those insertions
+# raise an error on the subscriber. Then we skip the transactions in question.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub', 'test_tab1',
+ $xid, qq(tap_sub|INSERT|test_tab1|t),
+ 'skip the error on changes of the prepared transaction');
+
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub_streaming', 'test_tab_streaming',
+ $xid, qq(tap_sub_streaming|INSERT|test_tab_streaming|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
# Check if the view doesn't show any entries after dropping the subscriptions.
$node_subscriber->safe_psql(
'postgres',
--
2.24.3 (Apple Git-128)
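For readers following the thread, the workflow the TAP tests above automate looks roughly like this from psql (view, column, and option names are as proposed in this patch series; XID 590 is the illustrative value from the errcontext example above, not real output):

```sql
-- After an apply worker starts failing, find the offending remote XID.
SELECT subname, relid::regclass, command, xid, count, error_message
  FROM pg_stat_subscription_errors;

-- Tell the apply worker to skip that transaction, then let it retry.
ALTER SUBSCRIPTION tap_sub SET (skip_xid = '590');

-- Once the transaction has been skipped, subskipxid is reset to NULL.
SELECT subskipxid FROM pg_subscription WHERE subname = 'tap_sub';
```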
From 9f8c21c735e111ff4794ea00bab4234265bfe567 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v16 1/3] Add a subscription errors statistics view
"pg_stat_subscription_errors".
This commit adds a new system view pg_stat_subscription_errors,
which shows information about errors that occur while applying
logical replication changes and while performing initial table
synchronization.
Subscription error entries are removed by autovacuum workers: entries
for table sync workers once table synchronization completes, and
entries for apply workers once the subscription is dropped.
It also adds an SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
---
doc/src/sgml/monitoring.sgml | 160 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 25 +
src/backend/postmaster/pgstat.c | 568 ++++++++++++++++++++
src/backend/replication/logical/worker.c | 54 +-
src/backend/utils/adt/pgstatfuncs.c | 121 +++++
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 123 +++++
src/test/regress/expected/rules.out | 20 +
src/test/subscription/t/025_error_report.pl | 156 ++++++
src/tools/pgindent/typedefs.list | 6 +
11 files changed, 1245 insertions(+), 3 deletions(-)
create mode 100644 src/test/subscription/t/025_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 7355835202..494e7ec3df 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -337,6 +337,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that has occurred on a subscription, showing information about
+ each subscription error.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
<row>
<entry><structname>pg_stat_ssl</structname><indexterm><primary>pg_stat_ssl</primary></indexterm></entry>
<entry>One row per connection (regular and replication), showing information about
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; NULL for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error occurred. This field
+ is always NULL if the error was reported during the initial data
+ copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher's transaction that was being applied
+ when the error occurred. This field is always NULL if the error was
+ reported during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which these statistics were last reset
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..e3a52f6899 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,28 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_error_time,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b7d0fbaefd..2634e5321c 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
/* ----------
@@ -282,6 +285,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subWorkerStatHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -332,6 +336,13 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(Oid subid, Oid subrelid,
+ bool create);
+static void pgstat_reset_subworker_error(PgStat_StatSubWorkerEntry *wentry, TimestampTz ts);
+static void pgstat_report_subworker_purge(PgStat_MsgSubWorkerPurge *msg);
+static void pgstat_report_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg);
+static void pgstat_vacuum_subworker_stats(void);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -356,6 +367,7 @@ static void pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, in
static void pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len);
static void pgstat_recv_resetslrucounter(PgStat_MsgResetslrucounter *msg, int len);
static void pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg, int len);
+static void pgstat_recv_resetsubworkererror(PgStat_MsgResetsubworkererror *msg, int len);
static void pgstat_recv_autovac(PgStat_MsgAutovacStart *msg, int len);
static void pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len);
static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len);
@@ -373,6 +385,9 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
+static void pgstat_recv_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg, int len);
+static void pgstat_recv_subworker_purge(PgStat_MsgSubWorkerPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1178,6 +1193,10 @@ pgstat_vacuum_stat(void)
}
}
+ /* Cleanup the dead subscription workers statistics */
+ if (subWorkerStatHash)
+ pgstat_vacuum_subworker_stats();
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1355,6 +1374,175 @@ pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid)
}
+/* PgStat_StatSubWorkerEntry comparator, sorting by subid and then subrelid */
+static int
+subworker_stats_comparator(const ListCell *a, const ListCell *b)
+{
+ PgStat_StatSubWorkerEntry *entry1 = (PgStat_StatSubWorkerEntry *) lfirst(a);
+ PgStat_StatSubWorkerEntry *entry2 = (PgStat_StatSubWorkerEntry *) lfirst(b);
+ int ret;
+
+ ret = oid_cmp(&entry1->key.subid, &entry2->key.subid);
+ if (ret != 0)
+ return ret;
+
+ return oid_cmp(&entry1->key.subrelid, &entry2->key.subrelid);
+}
+
+/* ----------
+ * pgstat_vacuum_subworker_stats() -
+ *
+ * This is a subroutine for pgstat_vacuum_stat to tell the collector about
+ * all the dead subscription worker statistics.
+ */
+static void
+pgstat_vacuum_subworker_stats(void)
+{
+ PgStat_MsgSubWorkerPurge wpmsg;
+ PgStat_MsgSubWorkerErrorPurge epmsg;
+ PgStat_StatSubWorkerEntry *wentry;
+ HTAB *subids;
+ HASH_SEQ_STATUS hstat;
+ List *subworker_stats = NIL;
+ List *not_ready_rels = NIL;
+ ListCell *lc1;
+
+ /* Build the list of worker stats and sort it by subid and subrelid */
+ hash_seq_init(&hstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ subworker_stats = lappend(subworker_stats, wentry);
+ list_sort(subworker_stats, subworker_stats_comparator);
+
+ /* Read pg_subscription and make a list of OIDs of all existing subscriptions */
+ subids = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ /*
+ * Search for all the dead subscriptions and unnecessary table sync worker
+ * entries in stats hashtable and tell the stats collector to drop them.
+ */
+ wpmsg.m_nentries = 0;
+ epmsg.m_nentries = 0;
+ epmsg.m_subid = InvalidOid;
+ foreach(lc1, subworker_stats)
+ {
+ ListCell *lc2;
+ bool keep_it = false;
+
+ wentry = (PgStat_StatSubWorkerEntry *) lfirst(lc1);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip if we already registered this subscription to purge */
+ if (wpmsg.m_nentries > 0 &&
+ wpmsg.m_subids[wpmsg.m_nentries - 1] == wentry->key.subid)
+ continue;
+
+ /* Check if the subscription is dead */
+ if (hash_search(subids, (void *) &(wentry->key.subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add the subid to the message */
+ wpmsg.m_subids[wpmsg.m_nentries++] = wentry->key.subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (wpmsg.m_nentries >= PGSTAT_NUM_SUBWORKERPURGE)
+ {
+ pgstat_report_subworker_purge(&wpmsg);
+ wpmsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * This subscription is alive. Next, look for errors of table sync
+ * workers whose tables are already in sync state; these errors
+ * should be removed.
+ */
+
+ /* We remove only table sync errors in the current database */
+ if (wentry->dbid != MyDatabaseId)
+ continue;
+
+ /* Skip if it's an apply worker error */
+ if (!OidIsValid(wentry->key.subrelid))
+ continue;
+
+ if (epmsg.m_subid != wentry->key.subid)
+ {
+ /*
+ * Send the purge message for previously collected table sync
+ * errors, if any.
+ */
+ if (epmsg.m_nentries > 0)
+ {
+ pgstat_report_subworker_error_purge(&epmsg);
+ epmsg.m_nentries = 0;
+ }
+
+ /* Clean up if necessary */
+ if (not_ready_rels != NIL)
+ list_free_deep(not_ready_rels);
+
+ /* Refresh the not-ready-relations of this subscription */
+ not_ready_rels = GetSubscriptionNotReadyRelations(wentry->key.subid);
+
+ /* Prepare the error purge message for the subscription */
+ epmsg.m_subid = wentry->key.subid;
+ }
+
+ /*
+ * Check if the table is still being synchronized or no longer belongs
+ * to the subscription.
+ */
+ foreach(lc2, not_ready_rels)
+ {
+ SubscriptionRelState *relstate = (SubscriptionRelState *) lfirst(lc2);
+
+ if (relstate->relid == wentry->key.subrelid)
+ {
+ /* This table is still being synchronized, so keep it */
+ keep_it = true;
+ break;
+ }
+ }
+
+ if (keep_it)
+ continue;
+
+
+ /* Add the table to the error purge message */
+ epmsg.m_relids[epmsg.m_nentries++] = wentry->key.subrelid;
+
+ /*
+ * If the error purge message is full, send it out and reinitialize to
+ * empty
+ */
+ if (epmsg.m_nentries >= PGSTAT_NUM_SUBWORKERERRORPURGE)
+ {
+ pgstat_report_subworker_error_purge(&epmsg);
+ epmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (wpmsg.m_nentries > 0)
+ pgstat_report_subworker_purge(&wpmsg);
+
+ /* Send the rest of dead error entries */
+ if (epmsg.m_nentries > 0)
+ pgstat_report_subworker_error_purge(&epmsg);
+
+ /* Clean up */
+ if (not_ready_rels != NIL)
+ list_free_deep(not_ready_rels);
+
+ hash_destroy(subids);
+}
+
/* ----------
* pgstat_drop_database() -
*
@@ -1544,6 +1732,24 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subworker_error_stats() -
+ *
+ * Tell the collector to reset the subscription worker error.
+ * ----------
+ */
+void
+pgstat_reset_subworker_error_stats(Oid subid, Oid subrelid)
+{
+ PgStat_MsgResetsubworkererror msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETSUBWORKERERROR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+
+ pgstat_send(&msg, sizeof(PgStat_MsgResetsubworkererror));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1804,6 +2010,47 @@ pgstat_should_report_connstat(void)
return MyBackendType == B_BACKEND;
}
+/* --------
+ * pgstat_report_subworker_purge() -
+ *
+ * Tell the collector about dead subscriptions.
+ * --------
+ */
+static void
+pgstat_report_subworker_purge(PgStat_MsgSubWorkerPurge *msg)
+{
+ int len;
+
+ Assert(msg->m_nentries > 0);
+
+ len = offsetof(PgStat_MsgSubWorkerPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBWORKERPURGE);
+ pgstat_send(msg, len);
+}
+
+/* --------
+ * pgstat_report_subworker_error_purge() -
+ *
+ * Tell the collector to remove table sync errors.
+ * --------
+ */
+static void
+pgstat_report_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg)
+{
+ int len;
+
+ Assert(OidIsValid(msg->m_subid));
+ Assert(msg->m_nentries > 0);
+
+ len = offsetof(PgStat_MsgSubWorkerErrorPurge, m_relids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBWORKERERRORPURGE);
+ pgstat_send(msg, len);
+}
+
/* ----------
* pgstat_report_replslot() -
*
@@ -1869,6 +2116,36 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+ len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_dbid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2987,6 +3264,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subworker() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_subworker(Oid subid, Oid subrelid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subworker_entry(subid, subrelid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3498,6 +3791,11 @@ PgstatCollectorMain(int argc, char *argv[])
len);
break;
+ case PGSTAT_MTYPE_RESETSUBWORKERERROR:
+ pgstat_recv_resetsubworkererror(&msg.msg_resetsubworkererror,
+ len);
+ break;
+
case PGSTAT_MTYPE_AUTOVAC_START:
pgstat_recv_autovac(&msg.msg_autovacuum_start, len);
break;
@@ -3568,6 +3866,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERRORPURGE:
+ pgstat_recv_subworker_error_purge(&msg.msg_subworkererrorpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERPURGE:
+ pgstat_recv_subworker_purge(&msg.msg_subworkerpurge, len);
+ break;
+
default:
break;
}
@@ -3868,6 +4179,22 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription worker stats struct
+ */
+ if (subWorkerStatHash)
+ {
+ PgStat_StatSubWorkerEntry *wentry;
+
+ hash_seq_init(&hstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(wentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4329,6 +4656,48 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ {
+ PgStat_StatSubWorkerEntry wbuf;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Read the subscription entry */
+ if (fread(&wbuf, 1, sizeof(PgStat_StatSubWorkerEntry), fpin)
+ != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subWorkerStatHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subWorkerStatHash = hash_create("Subscription worker stat entries",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription entry and initialize fields */
+ wentry =
+ (PgStat_StatSubWorkerEntry *) hash_search(subWorkerStatHash,
+ (void *) &wbuf.key,
+ HASH_ENTER, NULL);
+ memcpy(wentry, &wbuf, sizeof(PgStat_StatSubWorkerEntry));
+ break;
+ }
+
case 'E':
goto done;
@@ -4541,6 +4910,7 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
PgStat_WalStats myWalStats;
PgStat_SLRUStats mySLRUStats[SLRU_NUM_ELEMENTS];
PgStat_StatReplSlotEntry myReplSlotStats;
+ PgStat_StatSubWorkerEntry mySubWorkerStats;
FILE *fpin;
int32 format_id;
const char *statfile = permanent ? PGSTAT_STAT_PERMANENT_FILENAME : pgstat_stat_filename;
@@ -4671,6 +5041,22 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&mySubWorkerStats, 1, sizeof(mySubWorkerStats), fpin)
+ != sizeof(mySubWorkerStats))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+ break;
+
case 'E':
goto done;
@@ -4876,6 +5262,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subWorkerStatHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5344,6 +5731,33 @@ pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
}
}
+/* ----------
+ * pgstat_recv_resetsubworkererror() -
+ *
+ * Process a RESETSUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_resetsubworkererror(PgStat_MsgResetsubworkererror *msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ Assert(OidIsValid(msg->m_subid));
+
+ /* Get subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, false);
+
+ /*
+ * Nothing to do if the subscription error entry is not found. This could
+ * happen when the subscription is dropped and the message for dropping
+ * subscription entry arrived before the message for resetting the error.
+ */
+ if (wentry == NULL)
+ return;
+
+ /* reset the entry and set reset timestamp */
+ pgstat_reset_subworker_error(wentry, GetCurrentTimestamp());
+}
/* ----------
* pgstat_recv_autovac() -
@@ -5816,6 +6230,95 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Get the subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, true);
+ Assert(wentry);
+
+ /*
+ * Update only the counter and timestamp if we received the same error
+ * again
+ */
+ if (wentry->dbid == msg->m_dbid &&
+ wentry->relid == msg->m_relid &&
+ wentry->command == msg->m_command &&
+ wentry->xid == msg->m_xid &&
+ strcmp(wentry->message, msg->m_message) == 0)
+ {
+ wentry->count++;
+ wentry->timestamp = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ wentry->dbid = msg->m_dbid;
+ wentry->relid = msg->m_relid;
+ wentry->command = msg->m_command;
+ wentry->xid = msg->m_xid;
+ wentry->count = 1;
+ wentry->timestamp = msg->m_timestamp;
+ strlcpy(wentry->message, msg->m_message, PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
+/* ----------
+ * pgstat_recv_subworker_purge() -
+ *
+ * Process a SUBWORKERPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_purge(PgStat_MsgSubWorkerPurge *msg, int len)
+{
+ if (subWorkerStatHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ HASH_SEQ_STATUS sstat;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Remove all worker statistics of the subscription */
+ hash_seq_init(&sstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ if (wentry->key.subid == msg->m_subids[i])
+ (void) hash_search(subWorkerStatHash, (void *) &(wentry->key),
+ HASH_REMOVE, NULL);
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error_purge() -
+ *
+ * Process a SUBWORKERERRORPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg, int len)
+{
+ PgStat_StatSubWorkerKey key;
+
+ key.subid = msg->m_subid;
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ Assert(OidIsValid(msg->m_relids[i]));
+
+ key.subrelid = msg->m_relids[i];
+ (void) hash_search(subWorkerStatHash, (void *) &key, HASH_REMOVE, NULL);
+ }
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5934,6 +6437,71 @@ pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotent, TimestampTz ts)
slotent->stat_reset_timestamp = ts;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return the subscription worker entry for the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, return the entry of the apply
+ * worker; otherwise return that of the table sync worker associated with
+ * subrelid. If no entry exists and the create parameter is true,
+ * initialize one; otherwise return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+ PgStat_StatSubWorkerKey key;
+ HASHACTION action;
+ bool found;
+
+ if (subWorkerStatHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ subWorkerStatHash = hash_create("Subscription worker stat entries",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ action = (create ? HASH_ENTER : HASH_FIND);
+ wentry = (PgStat_StatSubWorkerEntry *) hash_search(subWorkerStatHash,
+ (void *) &key,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ pgstat_reset_subworker_error(wentry, 0);
+
+ return wentry;
+}
+
+/* ----------
+ * pgstat_reset_subworker_error
+ *
+ * Reset the given subscription worker error stats.
+ * ----------
+ */
+static void
+pgstat_reset_subworker_error(PgStat_StatSubWorkerEntry *wentry, TimestampTz ts)
+{
+ wentry->relid = InvalidOid;
+ wentry->command = 0;
+ wentry->xid = InvalidTransactionId;
+ wentry->count = 0;
+ wentry->timestamp = 0;
+ wentry->message[0] = '\0';
+ wentry->stat_reset_timestamp = ts;
+}
+
/*
* pgstat_slru_index
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 8d96c926b4..981c195a87 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3329,6 +3329,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3429,8 +3430,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding message
+ * type for table synchroniztion.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3548,7 +3571,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..ec9a4e43f5 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset subscription error stats */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subworker_error_stats(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,106 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_subworker(subid, subrelid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* relid */
+ if (OidIsValid(wentry->relid))
+ values[i++] = ObjectIdGetDatum(wentry->relid);
+ else
+ nulls[i++] = true;
+
+ /* command */
+ if (wentry->command != 0)
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(wentry->command));
+ else
+ nulls[i++] = true;
+
+ /* xid */
+ if (TransactionIdIsValid(wentry->xid))
+ values[i++] = TransactionIdGetDatum(wentry->xid);
+ else
+ nulls[i++] = true;
+
+ /* count */
+ values[i++] = Int64GetDatum(wentry->count);
+
+ /* error_message */
+ values[i++] = CStringGetTextDatum(wentry->message);
+
+ /* last_error_time */
+ if (wentry->timestamp != 0)
+ values[i++] = TimestampTzGetDatum(wentry->timestamp);
+ else
+ nulls[i++] = true;
+
+ /* stats_reset */
+ if (wentry->stat_reset_timestamp != 0)
+ values[i++] = TimestampTzGetDatum(wentry->stat_reset_timestamp);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..99fdd78816 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,relid,command,xid,count,error_message,last_error_time,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..43138f02e7 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,7 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_RESETSUBWORKERERROR,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -83,6 +85,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBWORKERERROR,
+ PGSTAT_MTYPE_SUBWORKERERRORPURGE,
+ PGSTAT_MTYPE_SUBWORKERPURGE,
} StatMsgType;
/* ----------
@@ -389,6 +394,24 @@ typedef struct PgStat_MsgResetreplslotcounter
bool clearall;
} PgStat_MsgResetreplslotcounter;
+/* ----------
+ * PgStat_MsgResetsubworkererror Sent by the backend to reset the subscription
+ * worker error information.
+ * ----------
+ */
+typedef struct PgStat_MsgResetsubworkererror
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * Same as PgStat_MsgSubWorkerError, m_subid and m_subrelid are used to
+ * determine the subscription and the reporter of the error: the apply
+ * worker or the table sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+} PgStat_MsgResetsubworkererror;
+
/* ----------
* PgStat_MsgAutovacStart Sent by the autovacuum daemon to signal
* that a database is going to be processed
@@ -536,6 +559,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
+ * report the error occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of the error. m_subrelid is InvalidOid if the error was
+ * reported by an apply worker; otherwise it was reported by a table
+ * sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * OIDs of the database and the table that the reporter was actually
+ * processing. This can be InvalidOid if the worker was applying a
+ * non-data-modification change such as STREAM_STOP.
+ */
+ Oid m_dbid;
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
+
+/* ----------
+ * PgStat_MsgSubWorkerPurge Sent by the backend and autovacuum to tell the
+ * collector about the dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBWORKERPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubWorkerPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBWORKERPURGE];
+} PgStat_MsgSubWorkerPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerErrorPurge Sent by the backend and autovacuum to purge
+ * the subscription errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBWORKERERRORPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubWorkerErrorPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBWORKERERRORPURGE];
+} PgStat_MsgSubWorkerErrorPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -697,6 +782,7 @@ typedef union PgStat_Msg
PgStat_MsgResetsinglecounter msg_resetsinglecounter;
PgStat_MsgResetslrucounter msg_resetslrucounter;
PgStat_MsgResetreplslotcounter msg_resetreplslotcounter;
+ PgStat_MsgResetsubworkererror msg_resetsubworkererror;
PgStat_MsgAutovacStart msg_autovacuum_start;
PgStat_MsgVacuum msg_vacuum;
PgStat_MsgAnalyze msg_analyze;
@@ -714,6 +800,9 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubWorkerError msg_subworkererror;
+ PgStat_MsgSubWorkerErrorPurge msg_subworkererrorpurge;
+ PgStat_MsgSubWorkerPurge msg_subworkerpurge;
} PgStat_Msg;
@@ -929,6 +1018,35 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that occurred
+ * during application of logical replication or the initial table synchronization.
+ */
+ Oid dbid;
+ Oid relid;
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter count;
+ TimestampTz timestamp;
+ char message[PGSTAT_SUBWORKERERROR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1022,6 +1140,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subworker_error_stats(Oid subid, Oid subrelid);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
@@ -1038,6 +1157,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1136,6 +1258,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_subworker(Oid subid, Oid subrelid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..a7714829ee 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,26 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_error_time,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ (LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(subid, subrelid, relid, command, xid, count, error_message, last_error_time, stats_reset)
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/025_error_report.pl b/src/test/subscription/t/025_error_report.pl
new file mode 100644
index 0000000000..c5af95f339
--- /dev/null
+++ b/src/test/subscription/t/025_error_report.pl
@@ -0,0 +1,156 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 5;
+
+# Test that the error reported in the pg_stat_subscription_errors view is expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass];
+ $check_sql .= " AND xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, command, relid::regclass, count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter an
+# infinite error loop due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Check that the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index cb5b5ec74c..6916c290f5 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1939,7 +1939,11 @@ PgStat_MsgResetreplslotcounter
PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
+PgStat_MsgResetsubworkererror
PgStat_MsgSLRU
+PgStat_MsgSubWorkerError
+PgStat_MsgSubWorkerErrorPurge
+PgStat_MsgSubWorkerPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1951,6 +1955,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
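For reviewers who want to exercise the error-reporting patch, a minimal session might look like the following. The view, function, and column names are those introduced by the patch above; the subscription name `tap_sub` is just an example:

```sql
-- Inspect the last error reported by the apply worker or a table sync
-- worker of a subscription.
SELECT subname, relid::regclass, command, xid, count, error_message
FROM pg_stat_subscription_errors
WHERE subname = 'tap_sub';

-- After resolving the conflict, reset the stored error for the apply
-- worker. Pass the relation OID as the second argument instead of NULL
-- to reset a table sync worker's error.
SELECT pg_stat_reset_subscription_error(
         (SELECT oid FROM pg_subscription WHERE subname = 'tap_sub'),
         NULL);
```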
Attachment: v16-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch (application/octet-stream)
From 6a712a3dcc5f69314a4f54987f99cd28f7ceb425 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v16 2/3] Add RESET command to ALTER SUBSCRIPTION command.
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be set are streaming, binary,
synchronous_commit.
The RESET parameter for ALTER SUBSCRIPTION is required by the
follow-up commit that introduces a new resettable subscription
parameter "skip_xid".
---
doc/src/sgml/ref/alter_subscription.sgml | 8 ++-
src/backend/commands/subscriptioncmds.c | 59 +++++++++++++++-------
src/backend/parser/gram.y | 11 +++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 ++++-
src/test/regress/sql/subscription.sql | 13 +++++
6 files changed, 87 insertions(+), 23 deletions(-)
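As a sketch of the syntax this patch adds (parameter names as documented below; the subscription name `mysub` is hypothetical):

```sql
-- Revert previously set parameters to their defaults in one command.
ALTER SUBSCRIPTION mysub RESET (streaming, binary, synchronous_commit);

-- RESET rejects values; the following is expected to fail with
-- ERROR: RESET must not include values for parameters
-- ALTER SUBSCRIPTION mysub RESET (streaming = on);
```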
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..c6ea386caa 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -194,16 +195,21 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are <literal>streaming</literal>,
+ <literal>binary</literal>, and <literal>synchronous_commit</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..896ec8b836 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,21 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
- supported_opts = (SUBOPT_SLOT_NAME |
- SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ bool is_reset = (stmt->kind == ALTER_SUBSCRIPTION_RESET_OPTIONS);
+
+ if (is_reset)
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+ else
+ supported_opts = (SUBOPT_SLOT_NAME |
+ SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, is_reset);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -926,7 +948,8 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +984,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1008,7 +1031,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = SUBOPT_REFRESH | SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1056,7 +1079,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 08f1bf1031..a7e2853f0e 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9721,7 +9721,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3138877553..539921cb52 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3676,7 +3676,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3688,7 +3689,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 15a1ac6398..e4c16cab66 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -281,11 +281,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7faa935a2a..3b0fbea897 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -215,6 +215,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
On Fri, Oct 8, 2021 at 9:22 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Thursday, September 30, 2021 2:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so far. Please
review them.

Hello
Minor two comments for v15-0001 patch.
(1) a typo in pgstat_vacuum_subworker_stat()
+ /* + * This subscription is live. The next step is that we search errors + * of the table sync workers who are already in sync state. These + * errors should be removed. + */This subscription is "alive" ?
(2) Suggestion to add one comment next to '0' in ApplyWorkerMain()
+ /* report the table sync error */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+                               MyLogicalRepWorker->relid,
+                               MyLogicalRepWorker->relid,
+                               0,
+                               InvalidTransactionId,
+                               errdata->message);
How about writing /* no corresponding message type for table synchronization */ or something ?
Thank you for the comments! Those comments are incorporated into the
latest patches I just submitted[1].
Regards,
[1]: /messages/by-id/CAD21AoDST8-ykrCLcWbWnTLj1u52-ZhiEP+bRU7kv5oBhfSy_Q@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Oct 12, 2021 at 4:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Some comments for the v16-0003 patch:
(1) doc/src/sgml/logical-replication.sgml
The output from "SELECT * FROM pg_stat_subscription_errors;" still
shows "last_failed_time" instead of "last_error_time".
doc/src/sgml/ref/alter_subscription.sgml
(2)
Suggested update (and fix typo: restrited -> restricted):
BEFORE:
+ Setting and resetting of <literal>skip_xid</literal> option is
+ restrited to superusers.
AFTER:
+ The setting and resetting of the
<literal>skip_xid</literal> option is
+ restricted to superusers.
(3)
Suggested improvement to the wording:
BEFORE:
+ incoming change or by skipping the whole transaction. This option
+ specifies transaction ID that logical replication worker skips to
+ apply. The logical replication worker skips all data modification
AFTER:
+ incoming changes or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose application is to
be skipped
+ by the logical replication worker. The logical replication worker
+ skips all data modification
(4) src/backend/replication/logical/worker.c
Suggested improvement to the comment wording:
BEFORE:
+ * Stop the skipping transaction if enabled. Otherwise, commit the changes
AFTER:
+ * Stop skipping the transaction changes, if enabled. Otherwise,
commit the changes
(5) skip_xid value validation
The validation of the specified skip_xid XID value isn't great.
For example, the following value are accepted:
ALTER SUBSCRIPTION sub SET (skip_xid='123abcz');
ALTER SUBSCRIPTION sub SET (skip_xid='99$@*');
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Oct 12, 2021 at 4:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Some comments for the v16-0001 patch:
src/backend/postmaster/pgstat.c
(1) pgstat_vacuum_subworker_stat()
Remove "the" from beginning of the following comment line:
+ * the all the dead subscription worker statistics.
(2) pgstat_reset_subscription_error_stats()
This function would be better named "pgstat_reset_subscription_subworker_error".
(3) pgstat_report_subworker_purge()
Improve comment:
BEFORE:
+ * Tell the collector about dead subscriptions.
AFTER:
+ * Tell the collector to remove dead subscriptions.
(4) pgstat_get_subworker_entry()
I notice that in the following code:
+ if (create && !found)
+ pgstat_reset_subworker_error(wentry, 0);
The newly-created PgStat_StatSubWorkerEntry doesn't get the "dbid"
member set, so I think it's a junk value in this case, yet the caller
of pgstat_get_subworker_entry(..., true) is referencing it:
+ /* Get the subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, true);
+ Assert(wentry);
+
+ /*
+ * Update only the counter and timestamp if we received the same error
+ * again
+ */
+ if (wentry->dbid == msg->m_dbid &&
+ wentry->relid == msg->m_relid &&
+ wentry->command == msg->m_command &&
+ wentry->xid == msg->m_xid &&
+ strcmp(wentry->message, msg->m_message) == 0)
+ {
+ wentry->count++;
+ wentry->timestamp = msg->m_timestamp;
+ return;
+ }
Maybe the cheapest solution is to just set dbid in
pgstat_reset_subworker_error()?
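
The suggested fix can be sketched outside the server as follows; the struct and function names here are simplified stand-ins for PgStat_StatSubWorkerEntry and pgstat_reset_subworker_error, not the actual patch code. The point is that the reset routine must initialize every field that the "same error again?" comparison later reads, dbid included:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-ins for the patch's stats types (hypothetical names). */
typedef unsigned int Oid;
typedef unsigned int TransactionId;

typedef struct SubWorkerEntry
{
	Oid			dbid;
	Oid			relid;
	int			command;
	TransactionId xid;
	long		count;
	char		message[64];
} SubWorkerEntry;

/*
 * Reset every member, including dbid, so that a freshly created hash
 * entry never carries uninitialized memory into the dedup comparison.
 */
void
subworker_entry_reset(SubWorkerEntry *e, Oid dbid)
{
	memset(e, 0, sizeof(*e));
	e->dbid = dbid;
}

/* The dedup check performed on an incoming error report. */
bool
is_same_error(const SubWorkerEntry *e, Oid dbid, Oid relid,
			  int command, TransactionId xid, const char *msg)
{
	return e->dbid == dbid && e->relid == relid &&
		e->command == command && e->xid == xid &&
		strcmp(e->message, msg) == 0;
}
```

With the reset setting dbid, a just-created entry compares deterministically instead of depending on whatever bytes happened to be in the allocation.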
src/backend/replication/logical/worker.c
(5) Fix typo
synchroniztion -> synchronization
+ * type for table synchroniztion.
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Oct 12, 2021 at 4:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
A couple more comments for some issues that I noticed in the v16 patches:
v16-0002
doc/src/sgml/ref/alter_subscription.sgml
(1) Order of parameters that can be reset doesn't match those that can be set.
Also, it doesn't match the order specified in the documentation
updates in the v16-0003 patch.
Suggested change:
BEFORE:
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
AFTER:
+ The parameters that can be reset are:
<literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>.
v16-0003
doc/src/sgml/ref/alter_subscription.sgml
(1) Documentation update says "slot_name" is a parameter that can be
reset, but this is not correct, it can't be reset.
Also, the doc update is missing "the" before "parameter".
Suggested change:
BEFORE:
+ The parameters that can be reset are: <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ <literal>streaming</literal>, and following parameter:
AFTER:
+ The parameters that can be reset are:
<literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>, and
the following
+ parameter:
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Oct 12, 2021 at 7:58 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Oct 12, 2021 at 4:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Some comments for the v16-0003 patch:
Thank you for the comments!
(1) doc/src/sgml/logical-replication.sgml
The output from "SELECT * FROM pg_stat_subscription_errors;" still
shows "last_failed_time" instead of "last_error_time".
Fixed.
doc/src/sgml/ref/alter_subscription.sgml
(2) Suggested update (and fix typo: restrited -> restricted):
BEFORE:
+ Setting and resetting of <literal>skip_xid</literal> option is
+ restrited to superusers.
AFTER:
+ The setting and resetting of the <literal>skip_xid</literal> option is
+ restricted to superusers.
Fixed.
(3)
Suggested improvement to the wording:
BEFORE:
+ incoming change or by skipping the whole transaction. This option
+ specifies transaction ID that logical replication worker skips to
+ apply. The logical replication worker skips all data modification
AFTER:
+ incoming changes or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose application is to be skipped
+ by the logical replication worker. The logical replication worker
+ skips all data modification
Updated.
(4) src/backend/replication/logical/worker.c
Suggested improvement to the comment wording:
BEFORE:
+ * Stop the skipping transaction if enabled. Otherwise, commit the changes
AFTER:
+ * Stop skipping the transaction changes, if enabled. Otherwise, commit the changes
Fixed.
(5) skip_xid value validation
The validation of the specified skip_xid XID value isn't great.
For example, the following values are accepted:
ALTER SUBSCRIPTION sub SET (skip_xid='123abcz');
ALTER SUBSCRIPTION sub SET (skip_xid='99$@*');
Hmm, this is probably a problem with the xid data type. For example, we can do the following:
postgres(1:12686)=# select 'aa123'::xid;
xid
-----
0
(1 row)
postgres(1:12686)=# select '123aa'::xid;
xid
-----
123
(1 row)
It seems like a problem to me. Perhaps we can fix it in a separate patch.
What do you think?
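
The laxness comes from the xid input function stopping at the first non-digit without reporting leftover characters. A stricter parse can be sketched as follows; parse_xid_strict is a hypothetical helper for illustration, not code from the patch or from xidin() itself. It requires the whole string to be consumed and the value to fit in 32 bits:

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a 32-bit transaction ID strictly. Unlike the lax behavior shown
 * above ('123aa' -> 123, 'aa123' -> 0), reject any string that is not
 * entirely a decimal number in range.
 */
bool
parse_xid_strict(const char *str, uint32_t *xid)
{
	char	   *endptr;
	unsigned long val;

	/* reject empty input, leading whitespace, and signs */
	if (str == NULL || !isdigit((unsigned char) str[0]))
		return false;

	errno = 0;
	val = strtoul(str, &endptr, 10);

	/* reject trailing garbage and values that overflow 32 bits */
	if (*endptr != '\0' || errno == ERANGE || val > UINT32_MAX)
		return false;

	*xid = (uint32_t) val;
	return true;
}
```

With this kind of check, an input like '123abcz' or '99$@*' would be rejected outright instead of being silently truncated to some other XID; a TransactionIdIsNormal()-style range check could still be applied on top.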
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Oct 13, 2021 at 10:59 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Oct 12, 2021 at 4:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Some comments for the v16-0001 patch:
Thank you for the comments!
src/backend/postmaster/pgstat.c
(1) pgstat_vacuum_subworker_stat()
Remove "the" from beginning of the following comment line:
+ * the all the dead subscription worker statistics.
Fixed.
(2) pgstat_reset_subscription_error_stats()
This function would be better named "pgstat_reset_subscription_subworker_error".
'subworker' contains an abbreviation of 'subscription'. So it seems
redundant to me. No?
(3) pgstat_report_subworker_purge()
Improve comment:
BEFORE:
+ * Tell the collector about dead subscriptions.
AFTER:
+ * Tell the collector to remove dead subscriptions.
Fixed.
(4) pgstat_get_subworker_entry()
I notice that in the following code:
+ if (create && !found)
+     pgstat_reset_subworker_error(wentry, 0);
The newly-created PgStat_StatSubWorkerEntry doesn't get the "dbid"
member set, so I think it's a junk value in this case, yet the caller
of pgstat_get_subworker_entry(..., true) is referencing it:
+ /* Get the subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, true);
+ Assert(wentry);
+
+ /*
+  * Update only the counter and timestamp if we received the same error
+  * again
+  */
+ if (wentry->dbid == msg->m_dbid &&
+     wentry->relid == msg->m_relid &&
+     wentry->command == msg->m_command &&
+     wentry->xid == msg->m_xid &&
+     strcmp(wentry->message, msg->m_message) == 0)
+ {
+     wentry->count++;
+     wentry->timestamp = msg->m_timestamp;
+     return;
+ }
Maybe the cheapest solution is to just set dbid in
pgstat_reset_subworker_error()?
I've changed the code to reset dbid in pgstat_reset_subworker_error().
src/backend/replication/logical/worker.c
(5) Fix typo
synchroniztion -> synchronization
+ * type for table synchroniztion.
Fixed.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Oct 14, 2021 at 5:45 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Oct 12, 2021 at 4:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
A couple more comments for some issues that I noticed in the v16 patches:
v16-0002
doc/src/sgml/ref/alter_subscription.sgml
(1) Order of parameters that can be reset doesn't match those that can be set.
Also, it doesn't match the order specified in the documentation
updates in the v16-0003 patch.
Suggested change:
BEFORE:
+ The parameters that can be reset are: <literal>streaming</literal>,
+ <literal>binary</literal>, <literal>synchronous_commit</literal>.
AFTER:
+ The parameters that can be reset are: <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>.
Fixed.
v16-0003
doc/src/sgml/ref/alter_subscription.sgml
(1) Documentation update says "slot_name" is a parameter that can be
reset, but this is not correct, it can't be reset.
Also, the doc update is missing "the" before "parameter".
Suggested change:
BEFORE:
+ The parameters that can be reset are: <literal>slot_name</literal>,
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ <literal>streaming</literal>, and following parameter:
AFTER:
+ The parameters that can be reset are: <literal>synchronous_commit</literal>,
+ <literal>binary</literal>, <literal>streaming</literal>, and the following
+ parameter:
Fixed.
I've attached updated patches that incorporate all comments I got so far.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v17-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch (application/octet-stream)
From 0454fbf6629a5123096a894653a940868f1faf1c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v17 3/3] Add skip_xid option to ALTER SUBSCRIPTION.
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating pg_subscription.subskipxid field, telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of
the subscription in the pg_stat_subscription_errors system view so
that the user does not get confused. This is done by sending a message
for clearing a subscription error to the stats collector.
---
doc/src/sgml/logical-replication.sgml | 55 +++++-
doc/src/sgml/ref/alter_subscription.sgml | 34 +++-
src/backend/catalog/pg_subscription.c | 10 ++
src/backend/commands/subscriptioncmds.c | 42 ++++-
src/backend/replication/logical/worker.c | 183 +++++++++++++++++++-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 ++
src/test/subscription/t/025_error_report.pl | 107 +++++++++++-
10 files changed, 441 insertions(+), 19 deletions(-)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..93743d6d00 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,67 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-errors"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the whole
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in the
+ <structname>pg_stat_subscription_errors</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_errors;
+-[ RECORD 1 ]----+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+relid | 16384
+command | INSERT
+xid | 716
+count | 50
+error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> on the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ In this case, you need to consider changing the data on the subscriber so that it
+ doesn't conflict with incoming changes, dropping the conflicting constraint or
+ unique index, or writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or, as a last resort, skipping the whole transaction.
+ Both skipping methods skip the whole transaction, including changes that may not
+ violate any constraint, and may easily make the subscriber inconsistent, especially
+ if a user specifies the wrong transaction ID or the wrong position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 2ed35f5408..b4b6ab8989 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -209,7 +209,39 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<para>
The parameters that can be reset are:
<literal>synchronous_commit</literal>, <literal>binary</literal>,
- and <literal>streaming</literal>.
+ <literal>streaming</literal>, and the following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraint, logical replication
+ will stop until it is resolved. The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ incoming changes or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose application is to be skipped
+ by the logical replication worker. The logical replication worker
+ skips all data modification changes within the specified transaction.
+ Therefore, since it skips the whole transaction including the changes
+ that may not violate the constraint, it should only be used as a last
+ resort. This option has no effect on a transaction that is already
+ prepared with <literal>two_phase</literal> enabled on the subscriber.
+ After the logical replication successfully skips the transaction, the
+ transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ Setting and resetting of the <literal>skip_xid</literal> option is
+ restricted to superusers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 896ec8b836..fd74037fb8 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -888,7 +916,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (is_reset)
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
else
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
@@ -941,6 +969,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 3a40684fa5..bed3b72853 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -255,6 +256,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID while we are skipping all data modification
+ * changes (INSERT/UPDATE/DELETE/TRUNCATE) of the transaction specified by
+ * MySubscription->skipxid. Note that we don't skip receiving the changes,
+ * particularly in streaming cases, since we decide whether or not to skip
+ * applying the changes when starting to apply them. Once we start skipping
+ * changes, we copy the XID to skipping_xid and don't stop skipping until we
+ * have skipped the whole transaction, even if the subscription is
+ * invalidated and MySubscription->skipxid gets changed or reset. When
+ * stopping the skipping behavior, we reset the skip XID (subskipxid) in the
+ * pg_subscription catalog and associate the origin status with the
+ * transaction that resets the skip XID so that we can start streaming from
+ * the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -335,6 +351,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -789,6 +808,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -813,7 +837,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop skipping the transaction changes, if enabled. Otherwise, commit
+ * the changes that are just applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -841,6 +876,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -899,9 +937,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -915,6 +954,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop skipping transaction changes, if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1046,6 +1089,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1056,6 +1102,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop skipping transaction changes, if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1081,9 +1131,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1206,6 +1257,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1289,6 +1341,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping transaction changes, if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1428,9 +1484,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2316,6 +2386,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recursively when applying spooled changes, save the current
@@ -3665,3 +3746,91 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by the skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and reset the skip XID (pg_subscription.subskipxid).
+ * If origin_lsn and origin_committs are valid, we set the origin state on the
+ * transaction commit that resets the skip XID so that we can restart streaming
+ * from the transaction following the one we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+}
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 539921cb52..63503b86da 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3694,6 +3694,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index e4c16cab66..e4dc4fb946 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -293,6 +293,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 3b0fbea897..c458b38985 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -228,6 +228,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/025_error_report.pl b/src/test/subscription/t/025_error_report.pl
index c5af95f339..94556e2007 100644
--- a/src/test/subscription/t/025_error_report.pl
+++ b/src/test/subscription/t/025_error_report.pl
@@ -1,12 +1,14 @@
# Copyright (c) 2021, PostgreSQL Global Development Group
-# Tests for subscription error reporting.
+# Tests for subscription error reporting and skipping logical
+# replication transactions.
+
use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 5;
+use Test::More tests => 14;
# Test if the error reported on pg_stat_subscription_errors view is expected.
sub test_subscription_error
@@ -32,6 +34,35 @@ WHERE relid = '$relname'::regclass;
]);
is($result, $expected_error, $msg);
}
+# Check the error reported on pg_stat_subscription_errors view and skip the
+# failed transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $subname, $relname, $xid, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $relname, $xid, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_errors WHERE relid = '$relname'::regclass");
+ is($skipxid, $xid, "remote xid and skip_xid are equal");
+
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber node so logical replication restarts immediately, without waiting for the retry interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
# Create publisher node.
my $node_publisher = PostgresNode->new('publisher');
@@ -123,7 +154,7 @@ $result = $node_subscriber->safe_psql('postgres',
is($result, q(1), 'check initial data are copied to subscriber');
# Insert more data to test_tab1, raising an error on the subscriber due to violation
-# of the unique constraint on test_tab1.
+# of the unique constraint on test_tab1. Then skip the transaction in question.
my $xid = $node_publisher->safe_psql(
'postgres',
qq[
@@ -132,15 +163,79 @@ INSERT INTO test_tab1 VALUES (1);
SELECT pg_current_xact_id()::xid;
COMMIT;
]);
-test_subscription_error($node_subscriber, 'test_tab1', $xid,
- qq(tap_sub|INSERT|test_tab1|t),
- 'check the error reported by the apply worker');
+test_skip_subscription_error($node_subscriber, 'tap_sub', 'test_tab1',
+ $xid, qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
# Check the table sync worker's error in the view.
test_subscription_error($node_subscriber, 'test_tab2', '',
qq(tap_sub||test_tab2|t),
'check the error reported by the table sync worker');
+# Insert enough rows into test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber while applying the spooled changes, for the same reason. Then
+# skip the transaction in question.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub_streaming', 'test_tab_streaming',
+ $xid, qq(tap_sub_streaming|INSERT|test_tab_streaming|t),
+ 'skip the error reported by the apply worker while applying streaming changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping transactions that are prepared and stream-prepared. We insert
+# the same data as in the previous tests but prepare the transactions. Those insertions
+# raise an error on the subscriber. Then we skip the transactions in question.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub', 'test_tab1',
+ $xid, qq(tap_sub|INSERT|test_tab1|t),
+ 'skip the error on changes of the prepared transaction');
+
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub_streaming', 'test_tab_streaming',
+ $xid, qq(tap_sub_streaming|INSERT|test_tab_streaming|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
# Check if the view doesn't show any entries after dropping the subscriptions.
$node_subscriber->safe_psql(
'postgres',
--
2.24.3 (Apple Git-128)
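For reviewers, here is a rough sketch of the intended end-to-end workflow with the three patches applied. The subscription name (tap_sub) and XID (590) are illustrative, taken from the earlier example and the TAP tests; exact output will vary:

```sql
-- On the subscriber, the apply worker keeps failing on a conflicting change.
-- Identify the failed transaction's remote XID from the errors view (0001):
SELECT subname, relid::regclass, xid, error_message
  FROM pg_stat_subscription_errors;

-- Tell the apply worker to skip that whole remote transaction (0003):
ALTER SUBSCRIPTION tap_sub SET (skip_xid = 590);

-- Once the transaction has been skipped, subskipxid is reset to NULL:
SELECT subname, subskipxid FROM pg_subscription;
```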
Attachment: v17-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch (application/octet-stream)
From e46e341840f2716c500ff655ef133cacb1967e91 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v17 2/3] Add RESET command to ALTER SUBSCRIPTION command.
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters to their default values. The parameters that can be reset
are streaming, binary, and synchronous_commit.
The RESET command for ALTER SUBSCRIPTION is required by the
follow-up commit that introduces a new resettable subscription
parameter "skip_xid".
---
doc/src/sgml/ref/alter_subscription.sgml | 9 +++-
src/backend/commands/subscriptioncmds.c | 59 +++++++++++++++-------
src/backend/parser/gram.y | 11 +++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 14 ++++-
src/test/regress/sql/subscription.sql | 13 +++++
6 files changed, 88 insertions(+), 23 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..2ed35f5408 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -194,16 +195,22 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are:
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ and <literal>streaming</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..896ec8b836 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,21 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
- supported_opts = (SUBOPT_SLOT_NAME |
- SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ bool is_reset = (stmt->kind == ALTER_SUBSCRIPTION_RESET_OPTIONS);
+
+ if (is_reset)
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+ else
+ supported_opts = (SUBOPT_SLOT_NAME |
+ SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, is_reset);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -926,7 +948,8 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +984,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1008,7 +1031,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = SUBOPT_REFRESH | SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1056,7 +1079,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 08f1bf1031..a7e2853f0e 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9721,7 +9721,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3138877553..539921cb52 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3676,7 +3676,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3688,7 +3689,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 15a1ac6398..e4c16cab66 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -281,11 +281,23 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7faa935a2a..3b0fbea897 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -215,6 +215,19 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
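A minimal usage sketch of the new RESET syntax from this 0002 patch (the subscription name is illustrative; resetting restores a parameter's CREATE SUBSCRIPTION default):

```sql
-- Reset one or more parameters back to their defaults:
ALTER SUBSCRIPTION mysub RESET (streaming);
ALTER SUBSCRIPTION mysub RESET (synchronous_commit, binary, streaming);

-- Passing a value is rejected:
ALTER SUBSCRIPTION mysub RESET (synchronous_commit = off);
-- ERROR:  RESET must not include values for parameters
```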
Attachment: v17-0001-Add-a-subscription-errors-statistics-view-pg_sta.patch (application/octet-stream)
From ea98cc6b0ab851354a8ec9d48bad2baa7e879b40 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v17 1/3] Add a subscription errors statistics view
"pg_stat_subscription_errors".
This commit adds a new system view, pg_stat_subscription_errors,
which shows information about any errors that occur while applying
logical replication changes as well as during the initial table
synchronization.
The subscription error entries are removed by autovacuum workers: after
table synchronization completes (for table sync workers) and after the
subscription is dropped (for apply workers).
It also adds an SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.
---
doc/src/sgml/monitoring.sgml | 160 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 25 +
src/backend/postmaster/pgstat.c | 574 ++++++++++++++++++++
src/backend/replication/logical/worker.c | 54 +-
src/backend/utils/adt/pgstatfuncs.c | 121 +++++
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 124 +++++
src/test/regress/expected/rules.out | 20 +
src/test/subscription/t/025_error_report.pl | 156 ++++++
src/tools/pgindent/typedefs.list | 6 +
11 files changed, 1252 insertions(+), 3 deletions(-)
create mode 100644 src/test/subscription/t/025_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 7355835202..94426b0516 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_errors</structname><indexterm><primary>pg_stat_subscription_errors</primary></indexterm></entry>
+ <entry>One row per error that has occurred on a subscription, showing
+ information about each subscription error.
+ See <link linkend="monitoring-pg-stat-subscription-errors">
+ <structname>pg_stat_subscription_errors</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3050,6 +3059,135 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-errors">
+ <title><structname>pg_stat_subscription_errors</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_errors</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_errors</structname> view will contain one
+ row per subscription error reported by workers applying logical replication
+ changes and workers handling the initial data copy of the subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-errors" xreflabel="pg_stat_subscription_errors">
+ <title><structname>pg_stat_subscription_errors</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; NULL for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error occurred. This field
+ is always NULL if the error was reported during the initial data
+ copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID, on the publisher node, of the transaction being applied
+ when the error occurred. This field is always NULL if the error was reported
+ during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which these statistics were last reset
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5310,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_error</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_error</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..c9aa6f04d3 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_error(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..e3a52f6899 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,28 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_errors AS
+ SELECT
+ e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_error_time,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b7d0fbaefd..0b5ae4c529 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
/* ----------
@@ -282,6 +285,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subWorkerStatHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -332,6 +336,13 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(Oid subid, Oid subrelid,
+ bool create);
+static void pgstat_reset_subworker_error(PgStat_StatSubWorkerEntry *wentry, TimestampTz ts);
+static void pgstat_report_subworker_purge(PgStat_MsgSubWorkerPurge *msg);
+static void pgstat_report_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg);
+static void pgstat_vacuum_subworker_stats(void);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -356,6 +367,7 @@ static void pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, in
static void pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len);
static void pgstat_recv_resetslrucounter(PgStat_MsgResetslrucounter *msg, int len);
static void pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg, int len);
+static void pgstat_recv_resetsubworkererror(PgStat_MsgResetsubworkererror *msg, int len);
static void pgstat_recv_autovac(PgStat_MsgAutovacStart *msg, int len);
static void pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len);
static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len);
@@ -373,6 +385,9 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
+static void pgstat_recv_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg, int len);
+static void pgstat_recv_subworker_purge(PgStat_MsgSubWorkerPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1178,6 +1193,10 @@ pgstat_vacuum_stat(void)
}
}
+ /* Cleanup the dead subscription workers statistics */
+ if (subWorkerStatHash)
+ pgstat_vacuum_subworker_stats();
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1355,6 +1374,176 @@ pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid)
}
+/* PgStat_StatSubWorkerEntry comparator, sorting by subid and then subrelid */
+static int
+subworker_stats_comparator(const ListCell *a, const ListCell *b)
+{
+ PgStat_StatSubWorkerEntry *entry1 = (PgStat_StatSubWorkerEntry *) lfirst(a);
+ PgStat_StatSubWorkerEntry *entry2 = (PgStat_StatSubWorkerEntry *) lfirst(b);
+ int ret;
+
+ ret = oid_cmp(&entry1->key.subid, &entry2->key.subid);
+ if (ret != 0)
+ return ret;
+
+ return oid_cmp(&entry1->key.subrelid, &entry2->key.subrelid);
+}
+
+/* ----------
+ * pgstat_vacuum_subworker_stats() -
+ *
+ * This is a subroutine for pgstat_vacuum_stat to tell the collector to
+ * remove dead subscription worker statistics.
+ */
+static void
+pgstat_vacuum_subworker_stats(void)
+{
+ PgStat_MsgSubWorkerPurge wpmsg;
+ PgStat_MsgSubWorkerErrorPurge epmsg;
+ PgStat_StatSubWorkerEntry *wentry;
+ HTAB *subids;
+ HASH_SEQ_STATUS hstat;
+ List *subworker_stats = NIL;
+ List *not_ready_rels = NIL;
+ ListCell *lc1;
+
+ /* Build the list of worker stats and sort it by subid and relid */
+ hash_seq_init(&hstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ subworker_stats = lappend(subworker_stats, wentry);
+ list_sort(subworker_stats, subworker_stats_comparator);
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ subids = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ /*
+ * Search for all the dead subscriptions and unnecessary table sync worker
+ * entries in stats hashtable and tell the stats collector to drop them.
+ */
+ wpmsg.m_nentries = 0;
+ epmsg.m_nentries = 0;
+ epmsg.m_subid = InvalidOid;
+ foreach(lc1, subworker_stats)
+ {
+ ListCell *lc2;
+ bool keep_it = false;
+
+ wentry = (PgStat_StatSubWorkerEntry *) lfirst(lc1);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip if we already registered this subscription to purge */
+ if (wpmsg.m_nentries > 0 &&
+ wpmsg.m_subids[wpmsg.m_nentries - 1] == wentry->key.subid)
+ continue;
+
+ /* Check if the subscription is dead */
+ if (hash_search(subids, (void *) &(wentry->key.subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add the subid to the message */
+ wpmsg.m_subids[wpmsg.m_nentries++] = wentry->key.subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (wpmsg.m_nentries >= PGSTAT_NUM_SUBWORKERPURGE)
+ {
+ pgstat_report_subworker_purge(&wpmsg);
+ wpmsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * This subscription is alive. Next, search for errors of table sync
+ * workers whose tables have already finished synchronization; these
+ * errors should be removed.
+ */
+
+ /* We remove only table sync errors in the current database */
+ if (wentry->dbid != MyDatabaseId)
+ continue;
+
+ /* Skip if it's an apply worker error */
+ if (!OidIsValid(wentry->key.subrelid))
+ continue;
+
+ if (epmsg.m_subid != wentry->key.subid)
+ {
+ /*
+ * Send the purge message for previously collected table sync
+ * errors, if there are any.
+ */
+ if (epmsg.m_nentries > 0)
+ {
+ pgstat_report_subworker_error_purge(&epmsg);
+ epmsg.m_nentries = 0;
+ }
+
+ /* Clean up if necessary */
+ if (not_ready_rels != NIL)
+ list_free_deep(not_ready_rels);
+
+ /* Refresh the not-ready-relations of this subscription */
+ not_ready_rels = GetSubscriptionNotReadyRelations(wentry->key.subid);
+
+ /* Prepare the error purge message for the subscription */
+ epmsg.m_subid = wentry->key.subid;
+ }
+
+ /*
+ * Check if the table is still being synchronized or no longer belongs
+ * to the subscription.
+ */
+ foreach(lc2, not_ready_rels)
+ {
+ SubscriptionRelState *relstate = (SubscriptionRelState *) lfirst(lc2);
+
+ if (relstate->relid == wentry->key.subrelid)
+ {
+ /* This table is still being synchronized, so keep it */
+ keep_it = true;
+ break;
+ }
+ }
+
+ if (keep_it)
+ continue;
+
+ /* Add the table to the error purge message */
+ epmsg.m_relids[epmsg.m_nentries++] = wentry->key.subrelid;
+
+ /*
+ * If the error purge message is full, send it out and reinitialize to
+ * empty
+ */
+ if (epmsg.m_nentries >= PGSTAT_NUM_SUBWORKERERRORPURGE)
+ {
+ pgstat_report_subworker_error_purge(&epmsg);
+ epmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (wpmsg.m_nentries > 0)
+ pgstat_report_subworker_purge(&wpmsg);
+
+ /* Send the rest of dead error entries */
+ if (epmsg.m_nentries > 0)
+ pgstat_report_subworker_error_purge(&epmsg);
+
+ /* Clean up */
+ if (not_ready_rels != NIL)
+ list_free_deep(not_ready_rels);
+
+ list_free(subworker_stats);
+ hash_destroy(subids);
+}
+
/* ----------
* pgstat_drop_database() -
*
@@ -1544,6 +1733,24 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subworker_error_stats() -
+ *
+ * Tell the collector to reset the subscription worker error statistics.
+ * ----------
+ */
+void
+pgstat_reset_subworker_error_stats(Oid subid, Oid subrelid)
+{
+ PgStat_MsgResetsubworkererror msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETSUBWORKERERROR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+
+ pgstat_send(&msg, sizeof(PgStat_MsgResetsubworkererror));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1804,6 +2011,47 @@ pgstat_should_report_connstat(void)
return MyBackendType == B_BACKEND;
}
+/* --------
+ * pgstat_report_subworker_purge() -
+ *
+ * Tell the collector to remove subscription worker statistics.
+ * --------
+ */
+static void
+pgstat_report_subworker_purge(PgStat_MsgSubWorkerPurge *msg)
+{
+ int len;
+
+ Assert(msg->m_nentries > 0);
+
+ len = offsetof(PgStat_MsgSubWorkerPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBWORKERPURGE);
+ pgstat_send(msg, len);
+}
+
+/* --------
+ * pgstat_report_subworker_error_purge() -
+ *
+ * Tell the collector to remove table sync errors.
+ * --------
+ */
+static void
+pgstat_report_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg)
+{
+ int len;
+
+ Assert(OidIsValid(msg->m_subid));
+ Assert(msg->m_nentries > 0);
+
+ len = offsetof(PgStat_MsgSubWorkerErrorPurge, m_relids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBWORKERERRORPURGE);
+ pgstat_send(msg, len);
+}
+
/* ----------
* pgstat_report_replslot() -
*
@@ -1869,6 +2117,36 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+ len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_dbid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2987,6 +3265,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subworker() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_subworker(Oid subid, Oid subrelid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subworker_entry(subid, subrelid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3498,6 +3792,11 @@ PgstatCollectorMain(int argc, char *argv[])
len);
break;
+ case PGSTAT_MTYPE_RESETSUBWORKERERROR:
+ pgstat_recv_resetsubworkererror(&msg.msg_resetsubworkererror,
+ len);
+ break;
+
case PGSTAT_MTYPE_AUTOVAC_START:
pgstat_recv_autovac(&msg.msg_autovacuum_start, len);
break;
@@ -3568,6 +3867,19 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERRORPURGE:
+ pgstat_recv_subworker_error_purge(&msg.msg_subworkererrorpurge,
+ len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERPURGE:
+ pgstat_recv_subworker_purge(&msg.msg_subworkerpurge, len);
+ break;
+
default:
break;
}
@@ -3868,6 +4180,22 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription worker stats struct
+ */
+ if (subWorkerStatHash)
+ {
+ PgStat_StatSubWorkerEntry *wentry;
+
+ hash_seq_init(&hstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(wentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4329,6 +4657,48 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ {
+ PgStat_StatSubWorkerEntry wbuf;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Read the subscription entry */
+ if (fread(&wbuf, 1, sizeof(PgStat_StatSubWorkerEntry), fpin)
+ != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subWorkerStatHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subWorkerStatHash = hash_create("Subscription worker stat entries",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription entry and initialize fields */
+ wentry =
+ (PgStat_StatSubWorkerEntry *) hash_search(subWorkerStatHash,
+ (void *) &wbuf.key,
+ HASH_ENTER, NULL);
+ memcpy(wentry, &wbuf, sizeof(PgStat_StatSubWorkerEntry));
+ break;
+ }
+
case 'E':
goto done;
@@ -4541,6 +4911,7 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
PgStat_WalStats myWalStats;
PgStat_SLRUStats mySLRUStats[SLRU_NUM_ELEMENTS];
PgStat_StatReplSlotEntry myReplSlotStats;
+ PgStat_StatSubWorkerEntry mySubWorkerStats;
FILE *fpin;
int32 format_id;
const char *statfile = permanent ? PGSTAT_STAT_PERMANENT_FILENAME : pgstat_stat_filename;
@@ -4671,6 +5042,22 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&mySubWorkerStats, 1, sizeof(mySubWorkerStats), fpin)
+ != sizeof(mySubWorkerStats))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+ break;
+
case 'E':
goto done;
@@ -4876,6 +5263,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subWorkerStatHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5344,6 +5732,33 @@ pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
}
}
+/* ----------
+ * pgstat_recv_resetsubworkererror() -
+ *
+ * Process a RESETSUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_resetsubworkererror(PgStat_MsgResetsubworkererror *msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ Assert(OidIsValid(msg->m_subid));
+
+ /* Get subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, false);
+
+ /*
+ * Nothing to do if the subscription error entry is not found. This could
+ * happen when the subscription is dropped and the message for dropping
+ * subscription entry arrived before the message for resetting the error.
+ */
+ if (wentry == NULL)
+ return;
+
+ /* reset the entry and set reset timestamp */
+ pgstat_reset_subworker_error(wentry, GetCurrentTimestamp());
+}
+
/* ----------
* pgstat_recv_autovac() -
@@ -5816,6 +6231,99 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Get the subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, true);
+ Assert(wentry);
+
+ /*
+ * Update only the counter and timestamp if we received the same error
+ * again
+ */
+ if (wentry->dbid == msg->m_dbid &&
+ wentry->relid == msg->m_relid &&
+ wentry->command == msg->m_command &&
+ wentry->xid == msg->m_xid &&
+ strcmp(wentry->message, msg->m_message) == 0)
+ {
+ wentry->count++;
+ wentry->timestamp = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ wentry->dbid = msg->m_dbid;
+ wentry->relid = msg->m_relid;
+ wentry->command = msg->m_command;
+ wentry->xid = msg->m_xid;
+ wentry->count = 1;
+ wentry->timestamp = msg->m_timestamp;
+ strlcpy(wentry->message, msg->m_message, PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
+/* ----------
+ * pgstat_recv_subworker_purge() -
+ *
+ * Process a SUBWORKERPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_purge(PgStat_MsgSubWorkerPurge *msg, int len)
+{
+ if (subWorkerStatHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ HASH_SEQ_STATUS sstat;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Remove all worker statistics of the subscription */
+ hash_seq_init(&sstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ if (wentry->key.subid == msg->m_subids[i])
+ (void) hash_search(subWorkerStatHash, (void *) &(wentry->key),
+ HASH_REMOVE, NULL);
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error_purge() -
+ *
+ * Process a SUBWORKERERRORPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error_purge(PgStat_MsgSubWorkerErrorPurge *msg, int len)
+{
+ PgStat_StatSubWorkerKey key;
+
+ key.subid = msg->m_subid;
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ /*
+ * Must be a table sync worker error as the apply worker error is
+ * dropped only when the subscription is dropped.
+ */
+ Assert(OidIsValid(msg->m_relids[i]));
+
+ key.subrelid = msg->m_relids[i];
+ (void) hash_search(subWorkerStatHash, (void *) &key, HASH_REMOVE, NULL);
+ }
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5934,6 +6442,72 @@ pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotent, TimestampTz ts)
slotent->stat_reset_timestamp = ts;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return the subscription worker entry with the given subscription OID
+ * and relation OID. If subrelid is InvalidOid, return the entry of the
+ * apply worker; otherwise, return the entry of the table sync worker
+ * associated with subrelid. If no entry exists and the create parameter
+ * is true, initialize one; otherwise return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+ PgStat_StatSubWorkerKey key;
+ HASHACTION action;
+ bool found;
+
+ if (subWorkerStatHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ subWorkerStatHash = hash_create("Subscription worker stat entries",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ action = (create ? HASH_ENTER : HASH_FIND);
+ wentry = (PgStat_StatSubWorkerEntry *) hash_search(subWorkerStatHash,
+ (void *) &key,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ pgstat_reset_subworker_error(wentry, 0);
+
+ return wentry;
+}
+
+/* ----------
+ * pgstat_reset_subworker_error
+ *
+ * Reset the given subscription worker error stats.
+ * ----------
+ */
+static void
+pgstat_reset_subworker_error(PgStat_StatSubWorkerEntry *wentry, TimestampTz ts)
+{
+ wentry->dbid = InvalidOid;
+ wentry->relid = InvalidOid;
+ wentry->command = 0;
+ wentry->xid = InvalidTransactionId;
+ wentry->count = 0;
+ wentry->timestamp = 0;
+ wentry->message[0] = '\0';
+ wentry->stat_reset_timestamp = ts;
+}
+
/*
* pgstat_slru_index
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 8d96c926b4..3a40684fa5 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3329,6 +3329,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3429,8 +3430,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding message
+ * type for table synchronization.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3548,7 +3571,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..ec9a4e43f5 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset subscription error statistics */
+Datum
+pg_stat_reset_subscription_error(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subworker_error_stats(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,106 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription error for the given subscription (and relation).
+ */
+Datum
+pg_stat_get_subscription_error(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_ERROR_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_ERROR_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_ERROR_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_subworker(subid, subrelid);
+
+ /* Return NULL if the subscription doesn't have any errors */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* relid */
+ if (OidIsValid(wentry->relid))
+ values[i++] = ObjectIdGetDatum(wentry->relid);
+ else
+ nulls[i++] = true;
+
+ /* command */
+ if (wentry->command != 0)
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(wentry->command));
+ else
+ nulls[i++] = true;
+
+ /* xid */
+ if (TransactionIdIsValid(wentry->xid))
+ values[i++] = TransactionIdGetDatum(wentry->xid);
+ else
+ nulls[i++] = true;
+
+ /* count */
+ values[i++] = Int64GetDatum(wentry->count);
+
+ /* error_message */
+ values[i++] = CStringGetTextDatum(wentry->message);
+
+ /* last_error_time */
+ if (wentry->timestamp != 0)
+ values[i++] = TimestampTzGetDatum(wentry->timestamp);
+ else
+ nulls[i++] = true;
+
+ /* stats_reset */
+ if (wentry->stat_reset_timestamp != 0)
+ values[i++] = TimestampTzGetDatum(wentry->stat_reset_timestamp);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..99fdd78816 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_error', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,relid,command,xid,count,error_message,last_error_time,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_error' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription error',
+ proname => 'pg_stat_reset_subscription_error', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_error' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..b8be908256 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,7 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_RESETSUBWORKERERROR,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -83,6 +85,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBWORKERERROR,
+ PGSTAT_MTYPE_SUBWORKERERRORPURGE,
+ PGSTAT_MTYPE_SUBWORKERPURGE,
} StatMsgType;
/* ----------
@@ -389,6 +394,24 @@ typedef struct PgStat_MsgResetreplslotcounter
bool clearall;
} PgStat_MsgResetreplslotcounter;
+/* ----------
+ * PgStat_MsgResetsubworkererror Sent by the backend to reset the subscription
+ * worker error information.
+ * ----------
+ */
+typedef struct PgStat_MsgResetsubworkererror
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * Same as PgStat_MsgSubWorkerError, m_subid and m_subrelid are used to
+ * determine the subscription and the reporter of the error: the apply
+ * worker or the table sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+} PgStat_MsgResetsubworkererror;
+
/* ----------
* PgStat_MsgAutovacStart Sent by the autovacuum daemon to signal
* that a database is going to be processed
@@ -536,6 +559,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
+ * report an error that occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of the error. m_subrelid is InvalidOid if the error is reported
+ * by an apply worker; otherwise it is reported by a table sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * Oids of the database and the table that the reporter was actually
+ * processing. m_relid can be InvalidOid if an error occurred while the
+ * worker was applying a non-data-modification message such as RELATION.
+ */
+ Oid m_dbid;
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
+
+/* ----------
+ * PgStat_MsgSubWorkerPurge Sent by the backend and autovacuum to tell the
+ * collector about dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBWORKERPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubWorkerPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBWORKERPURGE];
+} PgStat_MsgSubWorkerPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerErrorPurge Sent by the backend and autovacuum to purge
+ * the subscription errors.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBWORKERERRORPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubWorkerErrorPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBWORKERERRORPURGE];
+} PgStat_MsgSubWorkerErrorPurge;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -697,6 +782,7 @@ typedef union PgStat_Msg
PgStat_MsgResetsinglecounter msg_resetsinglecounter;
PgStat_MsgResetslrucounter msg_resetslrucounter;
PgStat_MsgResetreplslotcounter msg_resetreplslotcounter;
+ PgStat_MsgResetsubworkererror msg_resetsubworkererror;
PgStat_MsgAutovacStart msg_autovacuum_start;
PgStat_MsgVacuum msg_vacuum;
PgStat_MsgAnalyze msg_analyze;
@@ -714,6 +800,9 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubWorkerError msg_subworkererror;
+ PgStat_MsgSubWorkerErrorPurge msg_subworkererrorpurge;
+ PgStat_MsgSubWorkerPurge msg_subworkerpurge;
} PgStat_Msg;
@@ -929,6 +1018,36 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred while applying logical replication changes or during the
+ * initial table synchronization.
+ */
+ Oid dbid;
+ Oid relid;
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter count;
+ TimestampTz timestamp;
+ char message[PGSTAT_SUBWORKERERROR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1022,6 +1141,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subworker_error_stats(Oid subid, Oid subrelid);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
@@ -1038,6 +1158,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1136,6 +1259,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_subworker(Oid subid, Oid subrelid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..a7714829ee 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,26 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_errors| SELECT e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_error_time,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ (LATERAL pg_stat_get_subscription_error(sr.subid, sr.relid) e(subid, subrelid, relid, command, xid, count, error_message, last_error_time, stats_reset)
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/025_error_report.pl b/src/test/subscription/t/025_error_report.pl
new file mode 100644
index 0000000000..c5af95f339
--- /dev/null
+++ b/src/test/subscription/t/025_error_report.pl
@@ -0,0 +1,156 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 5;
+
+# Test if the error reported in the pg_stat_subscription_errors view is as expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass];
+ $check_sql .= " AND xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, command, relid::regclass, count > 0
+FROM pg_stat_subscription_errors
+WHERE relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter an
+# infinite error loop due to a unique constraint violation.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data into test_tab1, raising an error on the subscriber due to a
+# violation of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_errors");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index cb5b5ec74c..6916c290f5 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1939,7 +1939,11 @@ PgStat_MsgResetreplslotcounter
PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
+PgStat_MsgResetsubworkererror
PgStat_MsgSLRU
+PgStat_MsgSubWorkerError
+PgStat_MsgSubWorkerErrorPurge
+PgStat_MsgSubWorkerPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1951,6 +1955,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Mon, Oct 11, 2021 at 12:57 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Monday, October 11, 2021 11:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 8, 2021 at 4:09 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Thursday, September 30, 2021 2:45 PM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so
far. Please review them.
Sorry, if I misunderstand something but did someone check what happens
when we execute ALTER SUBSCRIPTION ... RESET (streaming) in the middle
of one txn which has several streaming of data to the sub, especially
after some part of txn has been already streamed.
My intention of this is something like *if* we can find an actual harm
of this, I wanted to suggest the necessity of a safeguard or some measure into the patch.
...
I observed that the subscriber doesn't accept STREAM_COMMIT in this
case but gets BEGIN&COMMIT instead at the end.
I couldn't find any apparent and immediate issue from those steps but
is that no problem ?
Probably, this kind of situation applies to other reset target options?
I think that if a subscription parameter such as ‘streaming’ and ‘binary’ is
changed, an apply worker exits and the launcher starts a new worker (see
maybe_reread_subscription()). So I guess that in this case, the apply worker
exited during receiving streamed changes, restarted, and received the same
changes with ‘streaming = off’, therefore it got BEGIN and COMMIT instead. I
think that this happens even by using ‘SET (‘streaming’ = off)’.
You are right. Yes, I checked that the apply worker did exit
and the new apply worker process dealt with the INSERT in the above case.
Also, setting streaming = false was same.
I think you can additionally verify that temporary streaming files get
removed after restart.
--
With Regards,
Amit Kapila.
On Mon, Oct 11, 2021 at 1:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, Oct 10, 2021 at 11:04 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 04.10.21 02:31, Masahiko Sawada wrote:
I guess disabling subscriptions on error/conflict and skipping the
particular transactions are somewhat different types of functions.
Disabling subscriptions on error/conflict seems like a setting
parameter of subscriptions. The users might want to specify this
option at creation time. Whereas, skipping the particular transaction
is a repair function that the user might want to use on the spot in
case of a failure. I’m concerned a bit that combining these functions
to one syntax could confuse the users.
Also, would the skip option be dumped and restored using pg_dump? Maybe
there is an argument for yes, but if not, then we probably need a
different path of handling it separate from the more permanent options.
Good point. I don’t think the skip option should be dumped and
restored using pg_dump since the utilization of transaction ids in
another installation is different.
This is a xid of publisher which subscriber wants to skip. So, even if
one restores the subscriber data in a different installation why would
it matter till it points to the same publisher?
Either way, can't we handle this in pg_dump?
--
With Regards,
Amit Kapila.
On Mon, Oct 18, 2021 at 6:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Oct 11, 2021 at 1:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, Oct 10, 2021 at 11:04 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 04.10.21 02:31, Masahiko Sawada wrote:
I guess disabling subscriptions on error/conflict and skipping the
particular transactions are somewhat different types of functions.
Disabling subscriptions on error/conflict seems likes a setting
parameter of subscriptions. The users might want to specify this
option at creation time. Whereas, skipping the particular transaction
is a repair function that the user might want to use on the spot in
case of a failure. I’m concerned a bit that combining these functions
to one syntax could confuse the users.
Also, would the skip option be dumped and restored using pg_dump? Maybe
there is an argument for yes, but if not, then we probably need a
different path of handling it separate from the more permanent options.
Good point. I don’t think the skip option should be dumped and
restored using pg_dump since the utilization of transaction ids in
another installation is different.
This is a xid of publisher which subscriber wants to skip. So, even if
one restores the subscriber data in a different installation why would
it matter till it points to the same publisher?
Either way, can't we handle this in pg_dump?
Because of backups (dumps), I think we cannot expect that the user
restore it somewhere soon. If the dump is restored several months
later, the publisher could be a different installation (by rebuilding
from scratch) or XID of the publisher could already be wrapped around.
It might be useful to dump the skip_xid by pg_dump in some cases, but
I think it should be optional if we want to do that.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Oct 19, 2021 at 8:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Oct 18, 2021 at 6:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Oct 11, 2021 at 1:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, Oct 10, 2021 at 11:04 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 04.10.21 02:31, Masahiko Sawada wrote:
I guess disabling subscriptions on error/conflict and skipping the
particular transactions are somewhat different types of functions.
Disabling subscriptions on error/conflict seems likes a setting
parameter of subscriptions. The users might want to specify this
option at creation time. Whereas, skipping the particular transaction
is a repair function that the user might want to use on the spot in
case of a failure. I’m concerned a bit that combining these functions
to one syntax could confuse the users.
Also, would the skip option be dumped and restored using pg_dump? Maybe
there is an argument for yes, but if not, then we probably need a
different path of handling it separate from the more permanent options.
Good point. I don’t think the skip option should be dumped and
restored using pg_dump since the utilization of transaction ids in
another installation is different.
This is a xid of publisher which subscriber wants to skip. So, even if
one restores the subscriber data in a different installation why would
it matter till it points to the same publisher?
Either way, can't we handle this in pg_dump?
Because of backups (dumps), I think we cannot expect that the user
restore it somewhere soon. If the dump is restored several months
later, the publisher could be a different installation (by rebuilding
from scratch) or XID of the publisher could already be wrapped around.
It might be useful to dump the skip_xid by pg_dump in some cases, but
I think it should be optional if we want to do that.
Agreed, I think it depends on the use case, so we can keep it
optional, or maybe in the initial version let's not dump it, and only
if we later see the use case then we can add an optional parameter in
pg_dump. Do you think we need any special handling if we decide not to
dump it? I think if we decide to dump it either optionally or
otherwise, then we do need changes in pg_dump.
--
With Regards,
Amit Kapila.
On Tue, Oct 19, 2021 at 12:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Oct 19, 2021 at 8:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Oct 18, 2021 at 6:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Oct 11, 2021 at 1:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, Oct 10, 2021 at 11:04 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 04.10.21 02:31, Masahiko Sawada wrote:
I guess disabling subscriptions on error/conflict and skipping the
particular transactions are somewhat different types of functions.
Disabling subscriptions on error/conflict seems likes a setting
parameter of subscriptions. The users might want to specify this
option at creation time. Whereas, skipping the particular transaction
is a repair function that the user might want to use on the spot in
case of a failure. I’m concerned a bit that combining these functions
to one syntax could confuse the users.
Also, would the skip option be dumped and restored using pg_dump? Maybe
there is an argument for yes, but if not, then we probably need a
different path of handling it separate from the more permanent options.
Good point. I don’t think the skip option should be dumped and
restored using pg_dump since the utilization of transaction ids in
another installation is different.
This is a xid of publisher which subscriber wants to skip. So, even if
one restores the subscriber data in a different installation why would
it matter till it points to the same publisher?
Either way, can't we handle this in pg_dump?
Because of backups (dumps), I think we cannot expect that the user
restore it somewhere soon. If the dump is restored several months
later, the publisher could be a different installation (by rebuilding
from scratch) or XID of the publisher could already be wrapped around.
It might be useful to dump the skip_xid by pg_dump in some cases, but
I think it should be optional if we want to do that.
Agreed, I think it depends on the use case, so we can keep it
optional, or maybe in the initial version let's not dump it, and only
if we later see the use case then we can add an optional parameter in
pg_dump.
Agreed. I prefer not to dump it in the first version since it's
difficult to remove the option once it's introduced.
Do you think we need any special handling if we decide not to
dump it? I think if we decide to dump it either optionally or
otherwise, then we do need changes in pg_dump.
Yeah, if we don't dump the skip_xid (which is the current patch
behavior), any special handling is not required for pg_dump. On the
other hand, if we do that in any way, we need changes for pg_dump.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Oct 18, 2021 9:34 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so far.
Hi,
Here are some minor comments for the patches.
v17-0001-Add-a-subscription-errors-statistics-view-pg_sta.patch
1)
+ /* Clean up */
+ if (not_ready_rels != NIL)
+ list_free_deep(not_ready_rels);
Maybe we don't need the ' if (not_ready_rels != NIL)' check as
list_free_deep will do this check internally.
2)
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ HASH_SEQ_STATUS sstat;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Remove all worker statistics of the subscription */
+ hash_seq_init(&sstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ if (wentry->key.subid == msg->m_subids[i])
+ (void) hash_search(subWorkerStatHash, (void *) &(wentry->key),
+ HASH_REMOVE, NULL);
Would it be a little faster if we scan hashtable in outerloop and
scan the msg in innerloop ?
Like:
while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
{
for (int i = 0; i < msg->m_nentries; i++)
...
v17-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command
1)
I noticed that we cannot RESET slot_name while we can SET it.
And the slot_name have a default behavior that " use the name of the subscription for the slot name.".
So, is it possible to support RESET it ?
Best regards,
Hou zj
On Mon, Oct 18, 2021 at 12:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so far.
Minor comment on patch 17-0003
src/backend/replication/logical/worker.c
(1) Typo in apply_handle_stream_abort() comment:
/* Stop skipping transaction transaction, if enabled */
should be:
/* Stop skipping transaction changes, if enabled */
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Oct 20, 2021 at 12:03 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Mon, Oct 18, 2021 9:34 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so far.
Hi,
Here are some minor comments for the patches.
Thank you for the comments!
v17-0001-Add-a-subscription-errors-statistics-view-pg_sta.patch
1)
+ /* Clean up */
+ if (not_ready_rels != NIL)
+     list_free_deep(not_ready_rels);
Maybe we don't need the 'if (not_ready_rels != NIL)' check as
list_free_deep will do this check internally.
Agreed.
2)
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+     HASH_SEQ_STATUS sstat;
+     PgStat_StatSubWorkerEntry *wentry;
+
+     /* Remove all worker statistics of the subscription */
+     hash_seq_init(&sstat, subWorkerStatHash);
+     while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+     {
+         if (wentry->key.subid == msg->m_subids[i])
+             (void) hash_search(subWorkerStatHash, (void *) &(wentry->key),
+                                HASH_REMOVE, NULL);
Would it be a little faster if we scan the hash table in the outer loop and
scan the msg in the inner loop?
Like:
while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
{
for (int i = 0; i < msg->m_nentries; i++)
...
Agreed.
v17-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command
1)
I noticed that we cannot RESET slot_name while we can SET it.
And the slot_name have a default behavior that " use the name of the subscription for the slot name.".
So, is it possible to support RESET it ?
Hmm, I'm not sure resetting slot_name is useful. I think that it’s
common to change the slot name to NONE by ALTER SUBSCRIPTION and vice
versa. But I think resetting the slot name (i.e., changing a
non-default name to the default name) is not a common use case. If
the user wants to do that, it seems safer to explicitly specify the
slot name by using ALTER SUBSCRIPTION ... SET (slot_name = 'XXX').
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Oct 20, 2021 at 12:33 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Oct 18, 2021 at 12:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so far.
Minor comment on patch 17-0003
Thank you for the comment!
src/backend/replication/logical/worker.c
(1) Typo in apply_handle_stream_abort() comment:
/* Stop skipping transaction transaction, if enabled */
should be:
/* Stop skipping transaction changes, if enabled */
Fixed.
I've attached updated patches. In this version, in addition to the
review comments I got so far, I've changed the view name from
pg_stat_subscription_errors to pg_stat_subscription_workers as per the
discussion on including xact info in the view on another thread[1].
I’ve also changed related code accordingly.
Regards,
[1]: /messages/by-id/CAD21AoDF7LmSALzMfmPshRw_xFcRz3WvB-me8T2gO6Ht=3zL2w@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v18-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch
From c9a1065c5c69f40a4bf10c693871bef281a6f587 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:27:40 +0900
Subject: [PATCH v18 3/3] Add skip_xid option to ALTER SUBSCRIPTION.
If an incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SET (skip_xid =
XXX), updating pg_subscription.subskipxid field, telling the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid. It also clears the error statistics of
the subscription in the pg_stat_subscription_errors system view so
that the user is not confused. This is done by sending a message to
the stats collector to clear the subscription error.
---
doc/src/sgml/logical-replication.sgml | 55 +++++-
doc/src/sgml/ref/alter_subscription.sgml | 34 +++-
src/backend/catalog/pg_subscription.c | 10 ++
src/backend/commands/subscriptioncmds.c | 42 ++++-
src/backend/replication/logical/worker.c | 183 +++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 1 +
src/test/regress/expected/subscription.out | 13 ++
src/test/regress/sql/subscription.sql | 11 ++
src/test/subscription/t/025_error_report.pl | 107 +++++++++++-
11 files changed, 445 insertions(+), 19 deletions(-)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 88646bc859..ff33f67900 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,67 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the whole
+ transaction. The transaction to be skipped is specified by its transaction
+ ID; the logical replication worker then skips all data modification changes
+ within that transaction. When a conflict produces an error, it is shown in
+ the <structname>pg_stat_subscription_workers</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]----+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+relid | 16384
+command | INSERT
+xid | 716
+count | 50
+error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+stats_reset |
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-07-15 21:54:58.802874+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by setting <replaceable>skip_xid</replaceable> on the subscription
+ by <command>ALTER SUBSCRIPTION ... SET</command>. Alternatively, the transaction
+ can also be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Before skipping, consider resolving the conflict in a less drastic way:
+ change the data on the subscriber so that it doesn't conflict with the
+ incoming changes, drop the conflicting constraint or unique index, or
+ write a trigger on the subscriber to suppress or redirect the conflicting
+ incoming changes. Skipping should be a last resort because both methods
+ skip the whole transaction, including changes that may not violate any
+ constraint, and they may easily make the subscriber inconsistent,
+ especially if a user specifies the wrong transaction ID or the position
+ of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
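As a side note on identifying the transaction to skip: the remote XID can be pulled out of the CONTEXT line shown above mechanically. A small hypothetical helper (the regex and function name are illustrative only, not part of this patch; written in Python purely to model the extraction):

```python
import re

# Matches the XID in a CONTEXT line such as:
#   CONTEXT: ... in transaction 716 with commit timestamp 2021-07-15 ...
CONTEXT_RE = re.compile(r"in transaction (\d+) with commit timestamp")

def xid_from_context(line: str):
    """Return the remote transaction ID from an apply-error CONTEXT line,
    or None if the line doesn't carry one."""
    m = CONTEXT_RE.search(line)
    return int(m.group(1)) if m else None
```

The extracted value is what would be fed to ALTER SUBSCRIPTION ... SET (skip_xid = ...).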
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 2ed35f5408..b4b6ab8989 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -209,7 +209,39 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<para>
The parameters that can be reset are:
<literal>synchronous_commit</literal>, <literal>binary</literal>,
- and <literal>streaming</literal>.
+ <literal>streaming</literal>, and the following parameter:
+ </para>
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term><literal>skip_xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ If incoming data violates any constraints, logical replication
+ stops until the problem is resolved. The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ the incoming changes or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose application is to be skipped
+ by the logical replication worker. The logical replication worker
+ skips all data modification changes within the specified transaction.
+ Therefore, since it skips the whole transaction, including the changes
+ that may not violate the constraint, it should only be used as a last
+ resort. This option has no effect on a transaction that is already
+ prepared with <literal>two_phase</literal> enabled on the subscriber.
+ After the logical replication successfully skips the transaction, the
+ transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ Setting and resetting the <literal>skip_xid</literal> option is
+ restricted to superusers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 896ec8b836..fd74037fb8 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,6 +60,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_SKIP_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -81,6 +82,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId skip_xid;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -129,6 +131,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->streaming = false;
if (IsSet(supported_opts, SUBOPT_TWOPHASE_COMMIT))
opts->twophase = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_XID))
+ opts->skip_xid = InvalidTransactionId;
/* Parse options */
foreach(lc, stmt_options)
@@ -261,6 +265,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "skip_xid") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ if (!is_reset)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id")));
+ opts->skip_xid = xid;
+ }
+
+ opts->specified_opts |= SUBOPT_SKIP_XID;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -485,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -888,7 +916,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (is_reset)
supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ SUBOPT_STREAMING | SUBOPT_SKIP_XID);
else
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
@@ -941,6 +969,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_XID))
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.skip_xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+ }
+
update_tuple = true;
break;
}
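The value accepted by the new option follows directly from the parsing code above: the string must parse as a 32-bit transaction ID and pass TransactionIdIsNormal(), i.e. not be one of the reserved IDs 0 (invalid), 1 (bootstrap), or 2 (frozen). A rough model of that rule (in Python, purely illustrative; the constant names mirror FirstNormalTransactionId and the 32-bit XID range in PostgreSQL):

```python
FIRST_NORMAL_XID = 3     # FirstNormalTransactionId: 0, 1, 2 are reserved
MAX_XID = 0xFFFFFFFF     # XIDs are unsigned 32-bit integers

def parse_skip_xid(xid_str: str) -> int:
    """Parse a skip_xid value, raising ValueError for anything that
    xidin or the TransactionIdIsNormal() check above would reject."""
    try:
        xid = int(xid_str, 10)
    except ValueError:
        raise ValueError("invalid transaction id")
    if not (FIRST_NORMAL_XID <= xid <= MAX_XID):
        raise ValueError("invalid transaction id")
    return xid
```

This matches the regression tests later in the patch: 3 and 4294967295 are accepted, while 0, 1, 2, and 1.1 are rejected.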
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 3a40684fa5..d771a0c058 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -255,6 +256,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is a valid XID while we are skipping all data modification
+ * changes (INSERT/UPDATE/DELETE/TRUNCATE) of the transaction specified by
+ * MySubscription->skipxid. Note that we don't skip receiving the changes,
+ * particularly in streaming cases, since we decide whether or not to skip
+ * applying the changes when starting to apply them. Once we start skipping
+ * changes, we copy the XID to skipping_xid and don't stop skipping until we
+ * have skipped the whole transaction, even if the subscription is
+ * invalidated and MySubscription->skipxid gets changed or reset. When
+ * stopping the skipping behavior, we reset the skip XID (subskipxid) in the
+ * pg_subscription catalog and associate the origin status with the
+ * transaction that resets the skip XID so that we can start streaming from
+ * the next transaction.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -335,6 +351,9 @@ static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
static inline void reset_apply_error_context_info(void);
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs);
+
/*
* Should this worker apply changes for given relation.
*
@@ -789,6 +808,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -813,7 +837,18 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ /*
+ * Stop skipping the transaction changes, if enabled. Otherwise, commit
+ * the changes that are just applied.
+ */
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
@@ -841,6 +876,9 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -899,9 +937,10 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction, possibly because we're
+ * skipping data-modification changes of this transaction. It is done this
+ * way because at commit prepared time, we won't know whether we have
+ * skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -915,6 +954,10 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
+ /* Stop skipping transaction changes, if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1046,6 +1089,9 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1056,6 +1102,10 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
+ /* Stop skipping transaction changes, if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
store_flush_position(prepare_data.end_lsn);
in_remote_transaction = false;
@@ -1081,9 +1131,10 @@ apply_handle_origin(StringInfo s)
{
/*
* ORIGIN message can only come inside streaming transaction or inside
- * remote transaction and before any actual writes.
+ * remote transaction and before any actual writes unless we're skipping
+ * changes of the transaction.
*/
- if (!in_streamed_transaction &&
+ if (!in_streamed_transaction && !is_skipping_changes() &&
(!in_remote_transaction ||
(IsTransactionState() && !am_tablesync_worker())))
ereport(ERROR,
@@ -1206,6 +1257,7 @@ apply_handle_stream_abort(StringInfo s)
errmsg_internal("STREAM ABORT message without STREAM STOP")));
logicalrep_read_stream_abort(s, &xid, &subxid);
+ maybe_start_skipping_changes(xid);
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
@@ -1289,6 +1341,10 @@ apply_handle_stream_abort(StringInfo s)
CommitTransactionCommand();
}
+ /* Stop skipping transaction changes, if enabled */
+ if (is_skipping_changes())
+ stop_skipping_changes(InvalidXLogRecPtr, 0);
+
reset_apply_error_context_info();
}
@@ -1428,9 +1484,23 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes(commit_data.end_lsn, commit_data.committime);
+
+ store_flush_position(commit_data.end_lsn);
+ in_remote_transaction = false;
+ }
+ else
+ {
+ /* commit the streamed transaction */
+ apply_handle_commit_internal(&commit_data);
+ }
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
@@ -2316,6 +2386,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3665,3 +3746,91 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, 0);
}
+
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!TransactionIdIsValid(skipping_xid));
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (!TransactionIdIsValid(MySubscription->skipxid) ||
+ MySubscription->skipxid != xid)
+ return;
+
+ skipping_xid = xid;
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ skipping_xid));
+}
+
+/*
+ * Stop skipping changes and clear the skip XID both locally and in the
+ * pg_subscription catalog (subskipxid). If origin_lsn and origin_committs
+ * are valid, we set the origin state to the transaction commit that resets
+ * the skip XID so that we can start streaming from the transaction next to
+ * the one that we just skipped.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_committs)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skipping_xid)));
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+
+ if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_committs;
+ }
+
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+}
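The worker-side changes above amount to a small state machine: skipping is armed at BEGIN when the remote XID matches the subscription's skip XID, data-modification messages are then discarded in apply_dispatch(), and at commit both the local state and the catalog value are cleared so that exactly one transaction is skipped. A simplified model of that control flow (Python, purely illustrative; this is not the actual implementation and omits streaming, prepare, and catalog/origin handling):

```python
INVALID_XID = 0
DML_ACTIONS = {"INSERT", "UPDATE", "DELETE", "TRUNCATE"}

class ApplyWorker:
    def __init__(self, skip_xid):
        self.sub_skip_xid = skip_xid      # models pg_subscription.subskipxid
        self.skipping_xid = INVALID_XID   # models static skipping_xid
        self.applied = []

    def handle_begin(self, xid):
        # maybe_start_skipping_changes(): arm skipping on XID match
        if self.sub_skip_xid != INVALID_XID and self.sub_skip_xid == xid:
            self.skipping_xid = xid

    def dispatch(self, action, payload):
        # apply_dispatch(): drop data-modification messages while skipping
        if self.skipping_xid != INVALID_XID and action in DML_ACTIONS:
            return
        self.applied.append((action, payload))

    def handle_commit(self):
        # stop_skipping_changes(): clear both the local state and the
        # catalog value, so the skip applies to exactly one transaction
        if self.skipping_xid != INVALID_XID:
            self.skipping_xid = INVALID_XID
            self.sub_skip_xid = INVALID_XID
```

In this model, a transaction whose XID matches skip_xid applies nothing, and the very next transaction applies normally because the skip XID has been cleared.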
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index ed8ed2f266..3e4c8c16cc 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4353,6 +4353,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't fetch subskipxid as we don't
+ * include it in the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 539921cb52..63503b86da 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3694,6 +3694,7 @@ typedef struct AlterSubscriptionStmt
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
List *options; /* List of DefElem nodes */
+ TransactionId skip_xid; /* XID to skip */
} AlterSubscriptionStmt;
typedef struct DropSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 924155e86c..8808f09f57 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -292,6 +292,19 @@ ERROR: unrecognized subscription parameter: "enabled"
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
ERROR: RESET must not include values for parameters
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ERROR: invalid transaction id
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+ERROR: invalid transaction id
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 9abbb77635..030cc63aa3 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -226,6 +226,17 @@ ALTER SUBSCRIPTION regress_testsub RESET (enabled);
-- fail - RESET must not include values
ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+-- it works
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 3);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = '4294967295');
+ALTER SUBSCRIPTION regress_testsub RESET (skip_xid);
+
+-- fail - invalid XID
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 0);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 1);
+ALTER SUBSCRIPTION regress_testsub SET (skip_xid = 2);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/025_error_report.pl b/src/test/subscription/t/025_error_report.pl
index d19caa7013..77d22eec52 100644
--- a/src/test/subscription/t/025_error_report.pl
+++ b/src/test/subscription/t/025_error_report.pl
@@ -1,12 +1,14 @@
# Copyright (c) 2021, PostgreSQL Global Development Group
-# Tests for subscription error reporting.
+# Tests for subscription error reporting and skipping logical
+# replication transactions.
+
use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 5;
+use Test::More tests => 14;
# Test if the error reported on pg_subscription_workers view is expected.
sub test_subscription_error
@@ -32,6 +34,35 @@ WHERE relid = '$relname'::regclass;
]);
is($result, $expected_error, $msg);
}
+# Check the error reported on the pg_stat_subscription_workers view and skip
+# the failed transaction.
+sub test_skip_subscription_error
+{
+ my ($node, $subname, $relname, $xid, $expected_error, $msg) = @_;
+
+ # Check the reported error.
+ test_subscription_error($node, $relname, $xid, $expected_error, $msg);
+
+ # Get XID of the failed transaction.
+ my $skipxid = $node->safe_psql(
+ 'postgres',
+ "SELECT xid FROM pg_stat_subscription_workers WHERE relid = '$relname'::regclass");
+ is($skipxid, $xid, "remote xid and skip_xid are equal");
+
+ $node->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SET (skip_xid = '$skipxid')");
+
+ # Restart the subscriber node to restart logical replication without waiting for the retry interval.
+ $node->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subskipxid IS NULL FROM pg_subscription
+WHERE subname = '$subname'
+]) or die "Timed out while waiting for the transaction to be skipped";
+}
# Create publisher node.
my $node_publisher = PostgresNode->new('publisher');
@@ -123,7 +154,7 @@ $result = $node_subscriber->safe_psql('postgres',
is($result, q(1), 'check initial data are copied to subscriber');
# Insert more data to test_tab1, raising an error on the subscriber due to violation
-# of the unique constraint on test_tab1.
+# of the unique constraint on test_tab1. Then skip the transaction in question.
my $xid = $node_publisher->safe_psql(
'postgres',
qq[
@@ -132,15 +163,79 @@ INSERT INTO test_tab1 VALUES (1);
SELECT pg_current_xact_id()::xid;
COMMIT;
]);
-test_subscription_error($node_subscriber, 'test_tab1', $xid,
- qq(tap_sub|INSERT|test_tab1|t),
- 'check the error reported by the apply worker');
+test_skip_subscription_error($node_subscriber, 'tap_sub', 'test_tab1',
+ $xid, qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
# Check the table sync worker's error in the view.
test_subscription_error($node_subscriber, 'test_tab2', '',
qq(tap_sub||test_tab2|t),
'check the error reported by the table sync worker');
+# Insert enough rows to test_tab_streaming to exceed the 64kB limit, also raising an
+# error on the subscriber during applying spooled changes for the same reason. Then
+# skip the transaction in question.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub_streaming', 'test_tab_streaming',
+ $xid, qq(tap_sub_streaming|INSERT|test_tab_streaming|t),
+ 'skip the error reported by the apply worker during applying streaming changes');
+
+# Insert data to test_tab1 and test_tab_streaming that don't conflict.
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab1 VALUES (2)");
+$node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO test_tab_streaming VALUES (10001, md5(10001::text))");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Check the data is successfully replicated after skipping the transactions.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT * FROM test_tab1");
+is($result, q(1
+2), "subscription gets changes after skipped transaction");
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM test_tab_streaming");
+is($result, q(2), "subscription gets changes after skipped streamed transaction");
+
+# Tests for skipping the transactions that are prepared and stream-prepared. We insert
+# the same data as the previous tests but prepare the transactions. Those insertions
+# raise an error on the subscriptions. Then we skip the transactions in question.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'skip_sub1';
+COMMIT PREPARED 'skip_sub1';
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub', 'test_tab1',
+ $xid, qq(tap_sub|INSERT|test_tab1|t),
+ 'skip the error on changes of the prepared transaction');
+
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'skip_sub2';
+COMMIT PREPARED 'skip_sub2';
+]);
+test_skip_subscription_error($node_subscriber, 'tap_sub_streaming', 'test_tab_streaming',
+ $xid, qq(tap_sub_streaming|INSERT|test_tab_streaming|t),
+ 'skip the error on changes of the prepared-streamed transaction');
+
# Check if the view doesn't show any entries after dropping the subscriptions.
$node_subscriber->safe_psql(
'postgres',
--
2.24.3 (Apple Git-128)
v18-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch
From 2f988ab0c912bcdf57ca235fdcf0f9f4252c2f49 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 2 Aug 2021 14:23:18 +0900
Subject: [PATCH v18 2/3] Add RESET command to ALTER SUBSCRIPTION command.
ALTER SUBSCRIPTION ... RESET command resets subscription
parameters. The parameters that can be reset are streaming, binary,
and synchronous_commit.
The RESET clause for ALTER SUBSCRIPTION is required by the
follow-up commit that introduces a new resettable subscription
parameter "skip_xid".
---
doc/src/sgml/ref/alter_subscription.sgml | 9 +++-
src/backend/commands/subscriptioncmds.c | 59 +++++++++++++++-------
src/backend/parser/gram.y | 11 +++-
src/include/nodes/parsenodes.h | 5 +-
src/test/regress/expected/subscription.out | 13 ++++-
src/test/regress/sql/subscription.sql | 11 ++++
6 files changed, 85 insertions(+), 23 deletions(-)
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..2ed35f5408 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -194,16 +195,22 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<varlistentry>
<term><literal>SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )</literal></term>
+ <term><literal>RESET ( <replaceable class="parameter">subscription_parameter</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
This clause alters parameters originally set by
<xref linkend="sql-createsubscription"/>. See there for more
- information. The parameters that can be altered
+ information. The parameters that can be set
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, and
<literal>streaming</literal>.
</para>
+ <para>
+ The parameters that can be reset are:
+ <literal>synchronous_commit</literal>, <literal>binary</literal>,
+ and <literal>streaming</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..896ec8b836 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -99,7 +99,8 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
*/
static void
parse_subscription_options(ParseState *pstate, List *stmt_options,
- bits32 supported_opts, SubOpts *opts)
+ bits32 supported_opts, SubOpts *opts,
+ bool is_reset)
{
ListCell *lc;
@@ -134,6 +135,11 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
{
DefElem *defel = (DefElem *) lfirst(lc);
+ if (is_reset && defel->arg != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("RESET must not include values for parameters")));
+
if (IsSet(supported_opts, SUBOPT_CONNECT) &&
strcmp(defel->defname, "connect") == 0)
{
@@ -192,12 +198,18 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
- opts->synchronous_commit = defGetString(defel);
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- /* Test if the given value is valid for synchronous_commit GUC. */
- (void) set_config_option("synchronous_commit", opts->synchronous_commit,
- PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
- false, 0, false);
+ /*
+ * Test if the given value is valid for synchronous_commit
+ * GUC.
+ */
+ (void) set_config_option("synchronous_commit", opts->synchronous_commit,
+ PGC_BACKEND, PGC_S_TEST, GUC_ACTION_SET,
+ false, 0, false);
+ }
}
else if (IsSet(supported_opts, SUBOPT_REFRESH) &&
strcmp(defel->defname, "refresh") == 0)
@@ -215,7 +227,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_BINARY;
- opts->binary = defGetBoolean(defel);
+ if (!is_reset)
+ opts->binary = defGetBoolean(defel);
}
else if (IsSet(supported_opts, SUBOPT_STREAMING) &&
strcmp(defel->defname, "streaming") == 0)
@@ -224,7 +237,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errorConflictingDefElem(defel, pstate);
opts->specified_opts |= SUBOPT_STREAMING;
- opts->streaming = defGetBoolean(defel);
+ if (!is_reset)
+ opts->streaming = defGetBoolean(defel);
}
else if (strcmp(defel->defname, "two_phase") == 0)
{
@@ -397,7 +411,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
- parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
+ parse_subscription_options(pstate, stmt->options, supported_opts, &opts,
+ false);
/*
* Since creating a replication slot is not transactional, rolling back
@@ -866,14 +881,21 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
switch (stmt->kind)
{
- case ALTER_SUBSCRIPTION_OPTIONS:
+ case ALTER_SUBSCRIPTION_SET_OPTIONS:
+ case ALTER_SUBSCRIPTION_RESET_OPTIONS:
{
- supported_opts = (SUBOPT_SLOT_NAME |
- SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING);
+ bool is_reset = (stmt->kind == ALTER_SUBSCRIPTION_RESET_OPTIONS);
+
+ if (is_reset)
+ supported_opts = (SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
+ else
+ supported_opts = (SUBOPT_SLOT_NAME |
+ SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
+ SUBOPT_STREAMING);
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, is_reset);
if (IsSet(opts.specified_opts, SUBOPT_SLOT_NAME))
{
@@ -926,7 +948,8 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
case ALTER_SUBSCRIPTION_ENABLED:
{
parse_subscription_options(pstate, stmt->options,
- SUBOPT_ENABLED, &opts);
+ SUBOPT_ENABLED, &opts, false);
+
Assert(IsSet(opts.specified_opts, SUBOPT_ENABLED));
if (!sub->slotname && opts.enabled)
@@ -961,7 +984,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
values[Anum_pg_subscription_subpublications - 1] =
publicationListToArray(stmt->publication);
@@ -1008,7 +1031,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = SUBOPT_REFRESH | SUBOPT_COPY_DATA;
parse_subscription_options(pstate, stmt->options,
- supported_opts, &opts);
+ supported_opts, &opts, false);
publist = merge_publications(sub->publications, stmt->publication, isadd, stmt->subname);
values[Anum_pg_subscription_subpublications - 1] =
@@ -1056,7 +1079,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions")));
parse_subscription_options(pstate, stmt->options,
- SUBOPT_COPY_DATA, &opts);
+ SUBOPT_COPY_DATA, &opts, false);
/*
* The subscription option "two_phase" requires that
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 08f1bf1031..a7e2853f0e 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9721,7 +9721,16 @@ AlterSubscriptionStmt:
{
AlterSubscriptionStmt *n =
makeNode(AlterSubscriptionStmt);
- n->kind = ALTER_SUBSCRIPTION_OPTIONS;
+ n->kind = ALTER_SUBSCRIPTION_SET_OPTIONS;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
+ | ALTER SUBSCRIPTION name RESET definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_RESET_OPTIONS;
n->subname = $3;
n->options = $5;
$$ = (Node *)n;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3138877553..539921cb52 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3676,7 +3676,8 @@ typedef struct CreateSubscriptionStmt
typedef enum AlterSubscriptionType
{
- ALTER_SUBSCRIPTION_OPTIONS,
+ ALTER_SUBSCRIPTION_SET_OPTIONS,
+ ALTER_SUBSCRIPTION_RESET_OPTIONS,
ALTER_SUBSCRIPTION_CONNECTION,
ALTER_SUBSCRIPTION_SET_PUBLICATION,
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
@@ -3688,7 +3689,7 @@ typedef enum AlterSubscriptionType
typedef struct AlterSubscriptionStmt
{
NodeTag type;
- AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
+ AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
char *subname; /* Name of the subscription */
char *conninfo; /* Connection string to publisher */
List *publication; /* One or more publication to subscribe to */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 15a1ac6398..924155e86c 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -281,11 +281,22 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ERROR: unrecognized subscription parameter: "connect"
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+ERROR: unrecognized subscription parameter: "enabled"
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+ERROR: RESET must not include values for parameters
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7faa935a2a..9abbb77635 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -215,6 +215,17 @@ ALTER SUBSCRIPTION regress_testsub SET (two_phase = false);
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
+-- ok
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit);
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit, binary, streaming);
+
+-- fail - unsupported parameters
+ALTER SUBSCRIPTION regress_testsub RESET (connect);
+ALTER SUBSCRIPTION regress_testsub RESET (enabled);
+
+-- fail - RESET must not include values
+ALTER SUBSCRIPTION regress_testsub RESET (synchronous_commit = off);
+
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
--
2.24.3 (Apple Git-128)
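For reference, the new RESET syntax exercised by the regression tests in the patch above can be used like this (a sketch; the subscription name `my_sub` is illustrative, and an existing subscription is assumed):

```sql
-- Reset a single option back to its default
ALTER SUBSCRIPTION my_sub RESET (streaming);

-- Several options can be reset at once
ALTER SUBSCRIPTION my_sub RESET (synchronous_commit, binary, streaming);

-- Supplying a value is rejected with:
--   ERROR: RESET must not include values for parameters
-- ALTER SUBSCRIPTION my_sub RESET (streaming = off);
```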
Attachment: v18-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch (application/octet-stream)
From 931aa345e40fbe64e09340bca9ea722b92a0f193 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v18 1/3] Add a subscription worker statistics view
"pg_stat_subscription_workers".
This commit adds a new system view pg_stat_subscription_workers,
which shows information about any errors that occur during the application
of logical replication changes as well as during the initial table
synchronization.
The subscription error entries are removed by autovacuum workers: for
table sync workers, once table synchronization completes; for apply
workers, once the subscription is dropped.
It also adds an SQL function pg_stat_reset_subscription_worker() to
reset a single subscription error.
---
doc/src/sgml/monitoring.sgml | 161 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 25 +
src/backend/postmaster/pgstat.c | 573 ++++++++++++++++++++
src/backend/replication/logical/worker.c | 54 +-
src/backend/utils/adt/pgstatfuncs.c | 122 +++++
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 124 +++++
src/test/regress/expected/rules.out | 20 +
src/test/subscription/t/025_error_report.pl | 156 ++++++
src/tools/pgindent/typedefs.list | 6 +
11 files changed, 1253 insertions(+), 3 deletions(-)
create mode 100644 src/test/subscription/t/025_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 7355835202..9400be4266 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>At least one row per subscription, showing information about
+ errors that occurred in the subscription.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3050,6 +3059,136 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription error reported by workers applying logical
+ replication changes and workers handling the initial data copy of the
+ subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; NULL for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This field
+ is always NULL if the error was reported during the initial data
+ copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is always NULL if the error was reported
+ during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which these statistics were last reset
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5172,6 +5311,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription worker error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index a416e94d37..21c8d6edd6 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..95226b8186 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1257,3 +1257,28 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_error_time,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) e
+ JOIN pg_subscription s ON (e.subid = s.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b7d0fbaefd..d9a2048fcf 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
/* ----------
@@ -282,6 +285,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subWorkerStatHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -332,6 +336,13 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(Oid subid, Oid subrelid,
+ bool create);
+static void pgstat_reset_subworker_entry(PgStat_StatSubWorkerEntry *wentry, TimestampTz ts);
+static void pgstat_report_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
+static void pgstat_report_subworker_purge(PgStat_MsgSubWorkerPurge *msg);
+static void pgstat_vacuum_subworker_stats(void);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -356,6 +367,7 @@ static void pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, in
static void pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len);
static void pgstat_recv_resetslrucounter(PgStat_MsgResetslrucounter *msg, int len);
static void pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg, int len);
+static void pgstat_recv_resetsubworkercounter(PgStat_MsgResetsubworkercounter *msg, int len);
static void pgstat_recv_autovac(PgStat_MsgAutovacStart *msg, int len);
static void pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len);
static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len);
@@ -373,6 +385,9 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
+static void pgstat_recv_subworker_purge(PgStat_MsgSubWorkerPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1178,6 +1193,10 @@ pgstat_vacuum_stat(void)
}
}
+ /* Cleanup the dead subscription workers statistics */
+ if (subWorkerStatHash)
+ pgstat_vacuum_subworker_stats();
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1355,6 +1374,173 @@ pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid)
}
+/* PgStat_StatSubWorkerEntry comparator, sorting by subid and subrelid */
+static int
+subworker_stats_comparator(const ListCell *a, const ListCell *b)
+{
+ PgStat_StatSubWorkerEntry *entry1 = (PgStat_StatSubWorkerEntry *) lfirst(a);
+ PgStat_StatSubWorkerEntry *entry2 = (PgStat_StatSubWorkerEntry *) lfirst(b);
+ int ret;
+
+ ret = oid_cmp(&entry1->key.subid, &entry2->key.subid);
+ if (ret != 0)
+ return ret;
+
+ return oid_cmp(&entry1->key.subrelid, &entry2->key.subrelid);
+}
+
+/* ----------
+ * pgstat_vacuum_subworker_stats() -
+ *
+ * This is a subroutine for pgstat_vacuum_stat to tell the collector to
+ * remove dead subscriptions and worker statistics.
+ */
+static void
+pgstat_vacuum_subworker_stats(void)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+ PgStat_MsgSubWorkerPurge wpmsg;
+ HASH_SEQ_STATUS hstat;
+ HTAB *subids;
+ List *subworker_stats = NIL;
+ List *not_ready_rels = NIL;
+ ListCell *lc1;
+
+ /* Build the list of worker stats and sort it by subid and subrelid */
+ hash_seq_init(&hstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ subworker_stats = lappend(subworker_stats, wentry);
+ list_sort(subworker_stats, subworker_stats_comparator);
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ subids = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ /*
+ * Search for all the dead subscriptions and unnecessary table sync worker
+ * entries in the stats hashtable and tell the stats collector to drop them.
+ */
+ spmsg.m_nentries = 0;
+ wpmsg.m_nentries = 0;
+ wpmsg.m_subid = InvalidOid;
+ foreach(lc1, subworker_stats)
+ {
+ ListCell *lc2;
+ bool keep_it = false;
+
+ wentry = (PgStat_StatSubWorkerEntry *) lfirst(lc1);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip if we already registered this subscription to purge */
+ if (spmsg.m_nentries > 0 &&
+ spmsg.m_subids[spmsg.m_nentries - 1] == wentry->key.subid)
+ continue;
+
+ /* Check if the subscription is dead */
+ if (hash_search(subids, (void *) &(wentry->key.subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = wentry->key.subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_report_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * This subscription is alive. Next, we search for table sync worker
+ * entries that are already in sync state. These should be removed.
+ */
+
+ /* We remove only table sync entries in the current database */
+ if (wentry->dbid != MyDatabaseId)
+ continue;
+
+ /* Skip if it's an apply worker entry */
+ if (!OidIsValid(wentry->key.subrelid))
+ continue;
+
+ if (wpmsg.m_subid != wentry->key.subid)
+ {
+ /*
+ * Send the purge message for previously collected table sync
+ * entries, if there are any.
+ */
+ if (wpmsg.m_nentries > 0)
+ {
+ pgstat_report_subworker_purge(&wpmsg);
+ wpmsg.m_nentries = 0;
+ }
+
+ /* Clean up the previously collected relations */
+ list_free_deep(not_ready_rels);
+
+ /* Refresh the not-ready-relations of this subscription */
+ not_ready_rels = GetSubscriptionNotReadyRelations(wentry->key.subid);
+
+ /* Prepare the worker purge message for the subscription */
+ wpmsg.m_subid = wentry->key.subid;
+ }
+
+ /*
+ * Check if the table is still being synchronized or no longer belongs
+ * to the subscription.
+ */
+ foreach(lc2, not_ready_rels)
+ {
+ SubscriptionRelState *relstate = (SubscriptionRelState *) lfirst(lc2);
+
+ if (relstate->relid == wentry->key.subrelid)
+ {
+ /* This table is still being synchronized, so keep it */
+ keep_it = true;
+ break;
+ }
+ }
+
+ if (keep_it)
+ continue;
+
+ /* Add the table to the worker purge message */
+ wpmsg.m_relids[wpmsg.m_nentries++] = wentry->key.subrelid;
+
+ /*
+ * If the worker purge message is full, send it out and reinitialize
+ * to empty
+ */
+ if (wpmsg.m_nentries >= PGSTAT_NUM_SUBWORKERPURGE)
+ {
+ pgstat_report_subworker_purge(&wpmsg);
+ wpmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_report_subscription_purge(&spmsg);
+
+ /* Send the rest of dead worker entries */
+ if (wpmsg.m_nentries > 0)
+ pgstat_report_subworker_purge(&wpmsg);
+
+ /* Clean up */
+ list_free_deep(not_ready_rels);
+ list_free(subworker_stats);
+ hash_destroy(subids);
+}
+
/* ----------
* pgstat_drop_database() -
*
@@ -1544,6 +1730,24 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subworker_stats() -
+ *
+ * Tell the collector to reset the subscription worker statistics.
+ * ----------
+ */
+void
+pgstat_reset_subworker_stats(Oid subid, Oid subrelid)
+{
+ PgStat_MsgResetsubworkercounter msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETSUBWORKERCOUNTER);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+
+ pgstat_send(&msg, sizeof(PgStat_MsgResetsubworkercounter));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1804,6 +2008,47 @@ pgstat_should_report_connstat(void)
return MyBackendType == B_BACKEND;
}
+/* --------
+ * pgstat_report_subscription_purge() -
+ *
+ * Tell the collector about dead subscriptions.
+ * --------
+ */
+static void
+pgstat_report_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ Assert(msg->m_nentries > 0);
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
+
+/* --------
+ * pgstat_report_subworker_purge() -
+ *
+ * Tell the collector to remove subscription worker statistics.
+ * --------
+ */
+static void
+pgstat_report_subworker_purge(PgStat_MsgSubWorkerPurge *msg)
+{
+ int len;
+
+ Assert(OidIsValid(msg->m_subid));
+ Assert(msg->m_nentries > 0);
+
+ len = offsetof(PgStat_MsgSubWorkerPurge, m_relids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBWORKERPURGE);
+ pgstat_send(msg, len);
+}
+
/* ----------
* pgstat_report_replslot() -
*
@@ -1869,6 +2114,36 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+ len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_dbid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2987,6 +3262,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subworker() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_subworker(Oid subid, Oid subrelid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subworker_entry(subid, subrelid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3498,6 +3789,11 @@ PgstatCollectorMain(int argc, char *argv[])
len);
break;
+ case PGSTAT_MTYPE_RESETSUBWORKERCOUNTER:
+ pgstat_recv_resetsubworkercounter(&msg.msg_resetsubworkercounter,
+ len);
+ break;
+
case PGSTAT_MTYPE_AUTOVAC_START:
pgstat_recv_autovac(&msg.msg_autovacuum_start, len);
break;
@@ -3568,6 +3864,18 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERPURGE:
+ pgstat_recv_subworker_purge(&msg.msg_subworkerpurge, len);
+ break;
+
default:
break;
}
@@ -3868,6 +4176,22 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription worker stats struct
+ */
+ if (subWorkerStatHash)
+ {
+ PgStat_StatSubWorkerEntry *wentry;
+
+ hash_seq_init(&hstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(wentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4329,6 +4653,48 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ {
+ PgStat_StatSubWorkerEntry wbuf;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Read the subscription entry */
+ if (fread(&wbuf, 1, sizeof(PgStat_StatSubWorkerEntry), fpin)
+ != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subWorkerStatHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subWorkerStatHash = hash_create("Subscription worker stat entries",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription entry and initialize fields */
+ wentry =
+ (PgStat_StatSubWorkerEntry *) hash_search(subWorkerStatHash,
+ (void *) &wbuf.key,
+ HASH_ENTER, NULL);
+ memcpy(wentry, &wbuf, sizeof(PgStat_StatSubWorkerEntry));
+ break;
+ }
+
case 'E':
goto done;
@@ -4541,6 +4907,7 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
PgStat_WalStats myWalStats;
PgStat_SLRUStats mySLRUStats[SLRU_NUM_ELEMENTS];
PgStat_StatReplSlotEntry myReplSlotStats;
+ PgStat_StatSubWorkerEntry mySubWorkerStats;
FILE *fpin;
int32 format_id;
const char *statfile = permanent ? PGSTAT_STAT_PERMANENT_FILENAME : pgstat_stat_filename;
@@ -4671,6 +5038,22 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&mySubWorkerStats, 1, sizeof(mySubWorkerStats), fpin)
+ != sizeof(mySubWorkerStats))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+ break;
+
case 'E':
goto done;
@@ -4876,6 +5259,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subWorkerStatHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5344,6 +5728,33 @@ pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
}
}
+/* ----------
+ * pgstat_recv_resetsubworkercounter() -
+ *
+ * Process a RESETSUBWORKERCOUNTER message.
+ * ----------
+ */
+static void
+pgstat_recv_resetsubworkercounter(PgStat_MsgResetsubworkercounter *msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ Assert(OidIsValid(msg->m_subid));
+
+ /* Get subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, false);
+
+ /*
+ * Nothing to do if the subscription error entry is not found. This could
+ * happen when the subscription is dropped and the message for dropping the
+ * subscription entry arrived before the message for resetting the error.
+ */
+ if (wentry == NULL)
+ return;
+
+ /* reset the entry and set reset timestamp */
+ pgstat_reset_subworker_entry(wentry, GetCurrentTimestamp());
+}
/* ----------
* pgstat_recv_autovac() -
@@ -5816,6 +6227,102 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS sstat;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ if (subWorkerStatHash == NULL)
+ return;
+
+ /* Remove all subscription worker statistics of the given subscriptions */
+ hash_seq_init(&sstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (wentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(subWorkerStatHash, (void *) &(wentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Get the subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, true);
+ Assert(wentry);
+
+ /*
+ * Update only the counter and timestamp if we received the same error
+ * again
+ */
+ if (wentry->dbid == msg->m_dbid &&
+ wentry->relid == msg->m_relid &&
+ wentry->command == msg->m_command &&
+ wentry->xid == msg->m_xid &&
+ strcmp(wentry->message, msg->m_message) == 0)
+ {
+ wentry->count++;
+ wentry->timestamp = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ wentry->dbid = msg->m_dbid;
+ wentry->relid = msg->m_relid;
+ wentry->command = msg->m_command;
+ wentry->xid = msg->m_xid;
+ wentry->count = 1;
+ wentry->timestamp = msg->m_timestamp;
+ strlcpy(wentry->message, msg->m_message, PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
+/* ----------
+ * pgstat_recv_subworker_purge() -
+ *
+ * Process a SUBWORKERPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_purge(PgStat_MsgSubWorkerPurge *msg, int len)
+{
+ PgStat_StatSubWorkerKey key;
+
+ /* Nothing to do if we don't have any subscription worker stats yet */
+ if (subWorkerStatHash == NULL)
+ return;
+
+ key.subid = msg->m_subid;
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ /*
+ * Must be a table sync worker error as the apply worker error is
+ * dropped only when the subscription is dropped.
+ */
+ Assert(OidIsValid(msg->m_relids[i]));
+
+ key.subrelid = msg->m_relids[i];
+ (void) hash_search(subWorkerStatHash, (void *) &key, HASH_REMOVE, NULL);
+ }
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5934,6 +6441,72 @@ pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotent, TimestampTz ts)
slotent->stat_reset_timestamp = ts;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return the subscription worker entry for the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, return the entry of the apply
+ * worker, otherwise that of the table sync worker associated with subrelid.
+ * If no entry exists and the create parameter is true, create and
+ * initialize one; otherwise return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+ PgStat_StatSubWorkerKey key;
+ HASHACTION action;
+ bool found;
+
+ if (subWorkerStatHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ subWorkerStatHash = hash_create("Subscription worker stat entries",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ action = (create ? HASH_ENTER : HASH_FIND);
+ wentry = (PgStat_StatSubWorkerEntry *) hash_search(subWorkerStatHash,
+ (void *) &key,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ pgstat_reset_subworker_entry(wentry, 0);
+
+ return wentry;
+}
+
+/* ----------
+ * pgstat_reset_subworker_entry
+ *
+ * Reset the given subscription worker statistics.
+ * ----------
+ */
+static void
+pgstat_reset_subworker_entry(PgStat_StatSubWorkerEntry *wentry, TimestampTz ts)
+{
+ wentry->dbid = InvalidOid;
+ wentry->relid = InvalidOid;
+ wentry->command = 0;
+ wentry->xid = InvalidTransactionId;
+ wentry->count = 0;
+ wentry->timestamp = 0;
+ wentry->message[0] = '\0';
+ wentry->stat_reset_timestamp = ts;
+}
+
/*
* pgstat_slru_index
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 8d96c926b4..3a40684fa5 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3329,6 +3329,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3429,8 +3430,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding message
+ * type for table synchronization.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3548,7 +3571,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..b58a61dbe6 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset a subscription worker stats */
+Datum
+pg_stat_reset_subscription_worker(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subworker_stats(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,107 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_subworker(subid, subrelid);
+
+ /* Return NULL if there is no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* relid */
+ if (OidIsValid(wentry->relid))
+ values[i++] = ObjectIdGetDatum(wentry->relid);
+ else
+ nulls[i++] = true;
+
+ /* command */
+ if (wentry->command != 0)
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(wentry->command));
+ else
+ nulls[i++] = true;
+
+ /* xid */
+ if (TransactionIdIsValid(wentry->xid))
+ values[i++] = TransactionIdGetDatum(wentry->xid);
+ else
+ nulls[i++] = true;
+
+ /* count */
+ values[i++] = Int64GetDatum(wentry->count);
+
+ /* error_message */
+ values[i++] = CStringGetTextDatum(wentry->message);
+
+ /* last_error_time */
+ if (wentry->timestamp != 0)
+ values[i++] = TimestampTzGetDatum(wentry->timestamp);
+ else
+ nulls[i++] = true;
+
+ /* stats_reset */
+ if (wentry->stat_reset_timestamp != 0)
+ values[i++] = TimestampTzGetDatum(wentry->stat_reset_timestamp);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..4543a00d2d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription error',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,relid,command,xid,count,error_message,last_error_time,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..59e37634a1 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,7 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_RESETSUBWORKERCOUNTER,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -83,6 +85,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
+ PGSTAT_MTYPE_SUBWORKERPURGE,
} StatMsgType;
/* ----------
@@ -389,6 +394,24 @@ typedef struct PgStat_MsgResetreplslotcounter
bool clearall;
} PgStat_MsgResetreplslotcounter;
+/* ----------
+ * PgStat_MsgResetsubworkercounter Sent by the backend to reset the subscription
+ * worker statistics.
+ * ----------
+ */
+typedef struct PgStat_MsgResetsubworkercounter
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * Same as PgStat_MsgSubWorkerError, m_subid and m_subrelid are used to
+ * determine the subscription and the reporter of the error: the apply
+ * worker or the table sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+} PgStat_MsgResetsubworkercounter;
+
/* ----------
* PgStat_MsgAutovacStart Sent by the autovacuum daemon to signal
* that a database is going to be processed
@@ -536,6 +559,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about the dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerPurge Sent by the backend and autovacuum to purge
+ * the subscription worker statistics.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBWORKERPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubWorkerPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBWORKERPURGE];
+} PgStat_MsgSubWorkerPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
+ * report the error occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of the error: m_subrelid is InvalidOid if the error is reported
+ * by an apply worker, otherwise it is reported by a table sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * Oids of the database and the table that the reporter was actually
+ * processing. m_relid can be InvalidOid if the error occurred while the
+ * worker was applying a non-data-modification message such as RELATION.
+ */
+ Oid m_dbid;
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -697,6 +782,7 @@ typedef union PgStat_Msg
PgStat_MsgResetsinglecounter msg_resetsinglecounter;
PgStat_MsgResetslrucounter msg_resetslrucounter;
PgStat_MsgResetreplslotcounter msg_resetreplslotcounter;
+ PgStat_MsgResetsubworkercounter msg_resetsubworkercounter;
PgStat_MsgAutovacStart msg_autovacuum_start;
PgStat_MsgVacuum msg_vacuum;
PgStat_MsgAnalyze msg_analyze;
@@ -714,6 +800,9 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
+ PgStat_MsgSubWorkerPurge msg_subworkerpurge;
} PgStat_Msg;
@@ -929,6 +1018,36 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of logical replication or the initial table
+ * synchronization.
+ */
+ Oid dbid;
+ Oid relid;
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter count;
+ TimestampTz timestamp;
+ char message[PGSTAT_SUBWORKERERROR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1022,6 +1141,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
@@ -1038,6 +1158,9 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
extern void pgstat_initialize(void);
@@ -1136,6 +1259,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_subworker(Oid subid, Oid subrelid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..c6b83cea27 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,26 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_error_time,
+ e.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) e(subid, subrelid, relid, command, xid, count, error_message, last_error_time, stats_reset)
+ JOIN pg_subscription s ON ((e.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/025_error_report.pl b/src/test/subscription/t/025_error_report.pl
new file mode 100644
index 0000000000..d19caa7013
--- /dev/null
+++ b/src/test/subscription/t/025_error_report.pl
@@ -0,0 +1,156 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 5;
+
+# Test if the error reported in the pg_stat_subscription_workers view is expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass];
+ $check_sql .= " AND xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, command, relid::regclass, count > 0
+FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgresNode->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgresNode->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter an
+# infinite error loop due to a unique constraint violation.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Check that the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index cb5b5ec74c..6916c290f5 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1939,7 +1939,11 @@ PgStat_MsgResetreplslotcounter
PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
+PgStat_MsgResetsubworkercounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionPurge
+PgStat_MsgSubWorkerError
+PgStat_MsgSubWorkerPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1951,6 +1955,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Wed, Oct 6, 2021 at 11:18 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Oct 4, 2021 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think here the main point is whether this addresses Peter's
concern for this patch to use a separate syntax? Peter E., can you
please confirm? Do let us know if you have something else going on in
your mind?

Peter's concern seemed to be that the use of a subscription option,
though convenient, isn't an intuitive natural fit for providing this
feature (i.e. ability to skip a transaction by xid). I tend to have
that feeling about using a subscription option for this feature. I'm
not sure what possible alternative syntax he had in mind and currently
can't really think of a good one myself that fits the purpose.

I think that the 1st and 2nd patches are useful in their own right, but
couldn't this feature (i.e. the 3rd patch) be provided instead as an
additional Replication Management function (see 9.27.6)?
e.g. pg_replication_skip_xid
After some thought on the syntax, it seems somewhat natural to me to
support the skip transaction feature with another syntax like (I
prefer the former):
ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
or
ALTER SUBSCRIPTION ... SKIP TRANSACTION xxx; (setting NONE as XID to
reset the XID to skip)
The primary reason to have another syntax is that the ability to skip a
transaction does not seem to be like other subscription parameters such as
slot_name, binary, and streaming, which are dumped by pg_dump. FWIW, IMO the
ability to disable the subscription on an error would be a
subscription parameter. The user is likely to want to specify this
option also at CREATE SUBSCRIPTION and wants it to be dumped by
pg_dump. So I think we can think of the skip xid option separately
from this parameter.
Also, I think we can think of the syntax for this ability (skipping a
transaction) separately from the syntax of the general conflict
resolution feature. I guess that we might rather need a whole new
syntax for conflict resolution. In addition, the user will want to
dump the definitions of conflict resolution by pg_dump in common
cases, unlike the skip XID.
As Amit pointed out, we might want to allow users to skip changes
based on something other than XID but the candidates seem only a few
to me (LSN, time, and something else?). If these are only a few,
probably we don’t need to worry about syntax bloat.
Regarding an additional replication management function proposed by
Greg, it seems a bit unnatural to me; the subscription is created and
altered by DDL, so why would only the skip transaction option be
specified via an SQL function?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Oct 25, 2021 at 7:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 6, 2021 at 11:18 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
I think that the 1st and 2nd patch are useful in their own right, but
couldn't this feature (i.e. the 3rd patch) be provided instead as an
additional Replication Management function (see 9.27.6)?
e.g. pg_replication_skip_xid

After some thoughts on the syntax, it's somewhat natural to me if we
support the skip transaction feature with another syntax like (I
prefer the former):

ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
or
ALTER SUBSCRIPTION ... SKIP TRANSACTION xxx; (setting NONE as XID to
reset the XID to skip)

The primary reason to have another syntax is that ability to skip a
transaction seems not to be other subscription parameters such as
slot_name, binary, streaming that are dumped by pg_dump. FWIW IMO the
ability to disable the subscription on an error would be a
subscription parameter. The user is likely to want to specify this
option also at CREATE SUBSCRIPTION and wants it to be dumped by
pg_dump. So I think we can think of the skip xid option separately
from this parameter.

Also, I think we can think of the syntax for this ability (skipping a
transaction) separately from the syntax of the general conflict
resolution feature. I guess that we might rather need a whole new
syntax for conflict resolution.
I agree that we will need a separate syntax for conflict resolution
but there is some similarity in what I proposed above (On
Error/Conflict [1]) with the existing syntax of Insert ... On
Conflict. I understand that here the context is different and we are
storing this information in the catalog but still there is some syntax
similarity and it will avoid adding new syntax variants.
In addition, the user will want to
dump the definitions of conflict resolution by pg_dump in common
cases, unlike the skip XID.

As Amit pointed out, we might want to allow users to skip changes
based on something other than XID but the candidates seem only a few
to me (LSN, time, and something else?). If these are only a few,
probably we don’t need to worry about syntax bloat.
I guess one might want to skip particular operations that cause an
error and that would be possible as we are providing the relevant
information via a view.
Regarding an additional replication management function proposed by
Greg, it seems a bit unnatural to me; the subscription is created and
altered by DDL but why is only skipping the transaction option
specified by an SQL function?
The one advantage I see is that it will be similar to what we already
have via pg_replication_origin_advance() for skipping WAL during
apply. The other thing could be that this feature can lead to problems
if not used carefully so maybe it is better to provide it only by
special functions. Having said that, I still feel we should do it via
Alter Subscription in some way as that will be convenient to use.
[1]: /messages/by-id/CAA4eK1+BOHXC=0S2kA7GkErWq3-QKj34oQvwAPfuTHq=epf34w@mail.gmail.com
--
With Regards,
Amit Kapila.
On Tue, Oct 26, 2021 at 5:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I agree that we will need a separate syntax for conflict resolution
but there is some similarity in what I proposed above (On
Error/Conflict [1]) with the existing syntax of Insert ... On
Conflict. I understand that here the context is different and we are
storing this information in the catalog but still there is some syntax
similarity and it will avoid adding new syntax variants.
The problem I see with the suggested syntax:
Alter Subscription <sub_name> On Error ( subscription_parameter [=
value] [, ... ] );
OR
Alter Subscription <sub_name> On Conflict ( subscription_parameter [=
value] [, ... ] );
is that "On Error ..." and "On Conflict" imply an action to be done on
a future condition (Error/Conflict), whereas at least in this case
(skip_xid) it's only AFTER the problem condition has occurred that we
know the XID of the failed transaction that we want to skip. So that
syntax looks a little confusing to me. Unless you had something else
in mind on how it would work?
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Oct 26, 2021 at 2:27 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Oct 26, 2021 at 5:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I agree that we will need a separate syntax for conflict resolution
but there is some similarity in what I proposed above (On
Error/Conflict [1]) with the existing syntax of Insert ... On
Conflict. I understand that here the context is different and we are
storing this information in the catalog but still there is some syntax
similarity and it will avoid adding new syntax variants.

The problem I see with the suggested syntax:
Alter Subscription <sub_name> On Error ( subscription_parameter [=
value] [, ... ] );
OR
Alter Subscription <sub_name> On Conflict ( subscription_parameter [=
value] [, ... ] );

is that "On Error ..." and "On Conflict" imply an action to be done on
a future condition (Error/Conflict), whereas at least in this case
(skip_xid) it's only AFTER the problem condition has occurred that we
know the XID of the failed transaction that we want to skip. So that
syntax looks a little confusing to me. Unless you had something else
in mind on how it would work?
You have a point. The other alternatives on this line could be:
Alter Subscription <sub_name> SKIP ( subscription_parameter [=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...
Instead of using Skip, we can use WITH as used in Alter Database
syntax but we are already using WITH in Create Subscription for a
different purpose, so that may not be a very good idea.
The basic idea is that I am trying to use options here rather than a
keyword-based syntax as there can be multiple such options.
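Under that option-based proposal, a concrete invocation might look like the following (hypothetical syntax, not present in any released PostgreSQL at this point):

```sql
-- Hypothetical: skip the remote transaction whose XID was reported in
-- the apply worker's error context (590 in the earlier example).
ALTER SUBSCRIPTION test_sub SKIP (xid = '590');
```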
--
With Regards,
Amit Kapila.
On Tue, Oct 26, 2021 at 7:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Oct 26, 2021 at 2:27 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Oct 26, 2021 at 5:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I agree that we will need a separate syntax for conflict resolution
but there is some similarity in what I proposed above (On
Error/Conflict [1]) with the existing syntax of Insert ... On
Conflict. I understand that here the context is different and we are
storing this information in the catalog but still there is some syntax
similarity and it will avoid adding new syntax variants.

The problem I see with the suggested syntax:
Alter Subscription <sub_name> On Error ( subscription_parameter [=
value] [, ... ] );
OR
Alter Subscription <sub_name> On Conflict ( subscription_parameter [=
value] [, ... ] );

is that "On Error ..." and "On Conflict" imply an action to be done on
a future condition (Error/Conflict), whereas at least in this case
(skip_xid) it's only AFTER the problem condition has occurred that we
know the XID of the failed transaction that we want to skip. So that
syntax looks a little confusing to me. Unless you had something else
in mind on how it would work?

You have a point. The other alternatives on this line could be:
Alter Subscription <sub_name> SKIP ( subscription_parameter [=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...
Looks better.
BTW how useful is specifying LSN instead of XID in practice? Given
that this skipping behavior is used to skip the particular transaction
(or its part of operations) in question, I’m not sure specifying LSN
or time is useful. And, if it’s essentially the same as
pg_replication_origin_advance(), we don’t need to have it.
The basic idea is that I am trying to use options here rather than a
keyword-based syntax as there can be multiple such options.
Agreed.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Oct 21, 2021 12:59 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches. In this version, in addition to the review
comments I got so far, I've changed the view name from
pg_stat_subscription_errors to pg_stat_subscription_workers as per the
discussion on including xact info to the view on another thread[1].
I’ve also changed related codes accordingly.
When reviewing the v18-0002 patch.
I noticed that "RESET SYNCHRONOUS_COMMIT" does not take effect
(RESET doesn't change the value to 'off').
+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- ...
+ }
I think we need to add an else branch here to set synchronous_commit to 'off'.
Best regards,
Hou zj
On Wed, Oct 27, 2021 at 8:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Oct 26, 2021 at 7:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You have a point. The other alternatives on this line could be:
Alter Subscription <sub_name> SKIP ( subscription_parameter [=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...

Looks better.
BTW how useful is specifying LSN instead of XID in practice? Given
that this skipping behavior is used to skip the particular transaction
(or its part of operations) in question, I’m not sure specifying LSN
or time is useful.
I think if the user wants to skip multiple xacts, she might want to
use the highest LSN to skip instead of specifying individual xids.
--
With Regards,
Amit Kapila.
On Wed, Oct 27, 2021 at 2:28 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
When reviewing the v18-0002 patch.
I noticed that "RESET SYNCHRONOUS_COMMIT" does not take effect
(RESET doesn't change the value to 'off').

+ if (!is_reset)
+ {
+ opts->synchronous_commit = defGetString(defel);
- ...
+ }

I think we need to add an else branch here to set synchronous_commit to 'off'.
I agree that it doesn't seem to handle the RESET of synchronous_commit.
I think that for consistency, the default value "off" for
synchronous_commit should be set (in the SubOpts) near where the
default values of the boolean supported options are currently set -
near the top of parse_subscription_options().
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Oct 27, 2021 at 12:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Oct 27, 2021 at 8:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Oct 26, 2021 at 7:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You have a point. The other alternatives on this line could be:
Alter Subscription <sub_name> SKIP ( subscription_parameter [=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...

Looks better.
BTW how useful is specifying LSN instead of XID in practice? Given
that this skipping behavior is used to skip the particular transaction
(or its part of operations) in question, I’m not sure specifying LSN
or time is useful.

I think if the user wants to skip multiple xacts, she might want to
use the highest LSN to skip instead of specifying individual xids.
I think that assumes a situation where the user already knows of
multiple transactions that cannot be applied on the subscription, but
how do they know?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Oct 27, 2021 at 10:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 27, 2021 at 12:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Oct 27, 2021 at 8:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Oct 26, 2021 at 7:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You have a point. The other alternatives on this line could be:
Alter Subscription <sub_name> SKIP ( subscription_parameter [=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...

Looks better.
BTW how useful is specifying LSN instead of XID in practice? Given
that this skipping behavior is used to skip the particular transaction
(or its part of operations) in question, I’m not sure specifying LSN
or time is useful.

I think if the user wants to skip multiple xacts, she might want to
use the highest LSN to skip instead of specifying individual xids.

I think it assumes that the situation where the user already knows
multiple transactions that cannot be applied on the subscription but
how do they know?
Either from the error messages in the server log or from the new view
we are planning to add. I think such a case is possible during the
initial synchronization phase where the apply worker went ahead of the
tablesync worker by skipping the changes for the corresponding table.
After that, it is possible that the tablesync worker failed during the
copy and the apply worker fails while processing some other rel. Now,
I think the only way to move forward is via LSNs. Currently, figuring
out the LSNs to skip is not straightforward, but improving that area
is the work of another patch.
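To make the identification step concrete, a user could run something like the following against the view proposed in this patch series (the view and column names follow the v18 patches under review and may still change):

```sql
-- Hypothetical query against the proposed pg_stat_subscription_workers
-- view; the columns are taken from the patch under review and are not
-- part of any released PostgreSQL.
SELECT subname, relid, xid, command, error_message, last_error_time
FROM pg_stat_subscription_workers
WHERE error_message IS NOT NULL;
```

The reported xid (or, for the LSN-based variant discussed here, an LSN taken from the server log) would then feed into the skip command.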
--
With Regards,
Amit Kapila.
On Thursday, October 21, 2021 12:59 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches. In this version, in addition to the
review comments I got so far, I've changed the view name from
pg_stat_subscription_errors to pg_stat_subscription_workers as per the
discussion on including xact info to the view on another thread[1].
I’ve also changed related codes accordingly.
Thanks for your patch.
I have some minor comments on your 0001 and 0002 patch.
1. For 0001 patch, src/backend/catalog/system_views.sql
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_error_time,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
...
Some places use TABs, I think it's better to use spaces here, to be consistent
with other places in this file.
2. For 0002 patch, I think we can add some changes to tab-complete.c, maybe
something like this:
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index ecae9df8ed..96665f6115 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,7 +1654,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "RESET",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1670,6 +1670,12 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> RESET */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "RESET"))
+ COMPLETE_WITH("(");
+ /* ALTER SUBSCRIPTION <name> RESET ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("RESET", "("))
+ COMPLETE_WITH("binary", "streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
Regards
Tang
On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Few comments:
==============
1. Is the patch cleaning tablesync error entries except via vacuum? If
not, can't we send a message to remove tablesync errors once tablesync
is successful (say when we reset skip_xid or when tablesync is
finished) or when we drop subscription? I think the same applies to
apply worker. I think we may want to track it in some way whether an
error has occurred before sending the message but relying completely
on a vacuum might be a recipe for bloat. I think in the case of a
drop subscription we can simply send the message as that is not a
frequent operation. I might be missing something here because in the
tests after drop subscription you are expecting the entries from the
view to get cleared.
2.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
Shall we name this field as error_count as there will be other fields
in this view in the future that may not be directly related to the
error?
3.
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_error_time,
+ e.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) e
It might be better to use 'w' as an alias instead of 'e' as the
information is now not restricted to only errors.
4. +# Test if the error reported on pg_subscription_workers view is expected.
The view name is wrong in the above comment
5.
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
Don't we need to wait after dropping the subscription and before
checking the view as there might be a slight delay in messages to get
cleared?
7.
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter to
+# infinite error due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
application_name=$appname' PUBLICATION tap_pub WITH (streaming = off,
two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION
'$publisher_connstr application_name=$appname_streaming' PUBLICATION
tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
How can we ensure that the subscriber would have caught up when one of
the tablesync workers is constantly in the error loop? Isn't it
possible that the subscriber didn't send the latest lsn feedback till
the table sync worker is finished?
8.
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter to
+# infinite error due to violating the unique constraint.
The second sentence of the comment can be written as: "The table sync
for test_tab2 on tap_sub will enter into infinite error loop due to
violating the unique constraint."
--
With Regards,
Amit Kapila.
On Wed, Oct 27, 2021 at 8:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Oct 26, 2021 at 7:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Oct 26, 2021 at 2:27 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Oct 26, 2021 at 5:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I agree that we will need a separate syntax for conflict resolution
but there is some similarity in what I proposed above (On
Error/Conflict [1]) with the existing syntax of Insert ... On
Conflict. I understand that here the context is different and we are
storing this information in the catalog but still there is some syntax
similarity and it will avoid adding new syntax variants.

The problem I see with the suggested syntax:
Alter Subscription <sub_name> On Error ( subscription_parameter [=
value] [, ... ] );
OR
Alter Subscription <sub_name> On Conflict ( subscription_parameter [=
value] [, ... ] );

is that "On Error ..." and "On Conflict" imply an action to be done on
a future condition (Error/Conflict), whereas at least in this case
(skip_xid) it's only AFTER the problem condition has occurred that we
know the XID of the failed transaction that we want to skip. So that
syntax looks a little confusing to me. Unless you had something else
in mind on how it would work?

You have a point. The other alternatives on this line could be:
Alter Subscription <sub_name> SKIP ( subscription_parameter [=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...

Looks better.
If we want to follow the above, then how do we allow users to reset
the parameter? One way is to allow the user to set xid as 0 which
would mean that we reset it. The other way is to allow SET/RESET
before SKIP but not sure if that is a good option. I was also thinking
about how we can extend the current syntax in the future if we want to
allow users to specify multiple xids? I guess we can either make xid
as a list or allow it to be specified multiple times. We don't need to
do this now but just from the point that we should be able to extend
it later if required.
--
With Regards,
Amit Kapila.
On Wed, Oct 27, 2021 at 2:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Oct 27, 2021 at 10:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 27, 2021 at 12:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Oct 27, 2021 at 8:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Oct 26, 2021 at 7:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You have a point. The other alternatives on this line could be:
Alter Subscription <sub_name> SKIP ( subscription_parameter [=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...

Looks better.
BTW how useful is specifying LSN instead of XID in practice? Given
that this skipping behavior is used to skip the particular transaction
(or its part of operations) in question, I’m not sure specifying LSN
or time is useful.

I think if the user wants to skip multiple xacts, she might want to
use the highest LSN to skip instead of specifying individual xids.

I think it assumes that the situation where the user already knows
multiple transactions that cannot be applied on the subscription but
how do they know?

Either from the error messages in the server log or from the new view
we are planning to add. I think such a case is possible during the
initial synchronization phase where apply worker went ahead then
tablesync worker by skipping to apply the changes on the corresponding
table. After that it is possible, that table sync worker failed during
copy and apply worker fails during the processing of some other rel.
Does it mean that if both initial copy for the corresponding table by
table sync worker and applying changes for other rels by apply worker
fail, we skip both by specifying LSN? If so, can't we disable the
initial copy for the table and skip only the changes for other rels
that cannot be applied?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Oct 28, 2021 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 27, 2021 at 2:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Oct 27, 2021 at 10:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
BTW how useful is specifying LSN instead of XID in practice? Given
that this skipping behavior is used to skip the particular transaction
(or its part of operations) in question, I’m not sure specifying LSN
or time is useful.

I think if the user wants to skip multiple xacts, she might want to
use the highest LSN to skip instead of specifying individual xids.

I think it assumes that the situation where the user already knows
multiple transactions that cannot be applied on the subscription but
how do they know?

Either from the error messages in the server log or from the new view
we are planning to add. I think such a case is possible during the
initial synchronization phase where apply worker went ahead then
tablesync worker by skipping to apply the changes on the corresponding
table. After that it is possible, that table sync worker failed during
copy and apply worker fails during the processing of some other rel.

Does it mean that if both initial copy for the corresponding table by
table sync worker and applying changes for other rels by apply worker
fail, we skip both by specifying LSN?
Yes.
If so, can't we disable the
initial copy for the table and skip only the changes for other rels
that cannot be applied?
But anyway you need some way to skip changes via a particular
tablesync worker so that it can mark the relation in 'ready' state. I
think one can also try to use disable_on_error option in such
scenarios depending on how we expose it. Say, if the option means that
all workers (apply or table sync) should be disabled on an error then
it would be a bit tricky but if we can come up with a way to behave
differently for different workers then it is possible to disable one
set of workers and skip the changes in another set of workers.
--
With Regards,
Amit Kapila.
On Wed, Oct 27, 2021 at 4:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Oct 27, 2021 at 8:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Oct 26, 2021 at 7:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You have a point. The other alternatives on this line could be:
Alter Subscription <sub_name> SKIP ( subscription_parameter [=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...

Looks better.
If we want to follow the above, then how do we allow users to reset
the parameter? One way is to allow the user to set xid as 0 which
would mean that we reset it. The other way is to allow SET/RESET
before SKIP but not sure if that is a good option.
After thinking some more on this, I think it is better to not use
SET/RESET keyword here. I think we can use a model similar to how we
allow setting some of the options in Alter Database:
# Set the connection limit for a database:
Alter Database akapila WITH connection_limit = 1;
# Reset the connection limit
Alter Database akapila WITH connection_limit = -1;
Thoughts?
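Applied to the skip feature, that Alter Database model would translate into something like this (hypothetical syntax; using an invalid XID as the reset sentinel is just one possible choice):

```sql
-- Hypothetical set/reset by analogy with connection_limit above.
ALTER SUBSCRIPTION test_sub SKIP (xid = '590');  -- arm the skip
ALTER SUBSCRIPTION test_sub SKIP (xid = '0');    -- reset (invalid xid)
```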
With Regards,
Amit Kapila.
On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Thank you for the comments!
Few comments:
==============
1. Is the patch cleaning tablesync error entries except via vacuum? If
not, can't we send a message to remove tablesync errors once tablesync
is successful (say when we reset skip_xid or when tablesync is
finished) or when we drop subscription? I think the same applies to
apply worker. I think we may want to track it in some way whether an
error has occurred before sending the message but relying completely
on a vacuum might be the recipe of bloat. I think in the case of a
drop subscription we can simply send the message as that is not a
frequent operation. I might be missing something here because in the
tests after drop subscription you are expecting the entries from the
view to get cleared
Yes, I think we can have the tablesync worker send a message to drop
stats once tablesync is successful. But if we do that also when
dropping a subscription, I think we need to do that only after the
transaction is committed, since we can drop a subscription that
doesn't have a replication slot and roll back the transaction.
Probably we can send the message only when the subscription does have
a replication slot. In other cases, we can remember the subscriptions
being dropped and send the message to drop their statistics after
committing the transaction, but I'm not sure it's worth having that.
FWIW, we completely rely on pgstat_vacuum_stat() for cleaning up the
dead tables and functions. And we don't expect there are many
subscriptions on the database.
What do you think?
2.

+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>

Shall we name this field as error_count as there will be other fields
in this view in the future that may not be directly related to the
error?
Agreed.
3.

+CREATE VIEW pg_stat_subscription_workers AS
+    SELECT
+        e.subid,
+        s.subname,
+        e.subrelid,
+        e.relid,
+        e.command,
+        e.xid,
+        e.count,
+        e.error_message,
+        e.last_error_time,
+        e.stats_reset
+    FROM (SELECT
+            oid as subid,
+            NULL as relid
+          FROM pg_subscription
+          UNION ALL
+          SELECT
+            srsubid as subid,
+            srrelid as relid
+          FROM pg_subscription_rel
+          WHERE srsubstate <> 'r') sr,
+         LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) e

It might be better to use 'w' as an alias instead of 'e' as the
information is now not restricted to only errors.
Agreed.
4. +# Test if the error reported on pg_subscription_workers view is expected.
The view name is wrong in the above comment
Fixed.
5.

+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+    'postgres',
+    q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+    "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');

Don't we need to wait after dropping the subscription and before
checking the view as there might be a slight delay in messages to get
cleared?
I think the test always passes without waiting for the statistics to
be updated since we fetch the subscription worker statistics from the
stats collector based on the entries of pg_subscription catalog. So
this test checks that statistics of an already-dropped subscription
don't show up in the view after DROP SUBSCRIPTION, but does not check if the
subscription worker statistics entry in the stats collector gets
removed. The primary reason is that as I mentioned above, the patch
relies on pgstat_vacuum_stat() for cleaning up the dead subscriptions.
7.

+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter to
+# infinite error due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+    'postgres',
+    "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+    'postgres',
+    "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);

How can we ensure that subscriber would have caught up when one of the
tablesync workers is constantly in the error loop? Isn't it possible
that the subscriber didn't send the latest lsn feedback till the table
sync worker is finished?
I thought that even if tablesync for a table is still ongoing, the
apply worker can apply commit records, update write LSN and flush LSN,
and send the feedback to the wal sender. No?
8.

+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter to
+# infinite error due to violating the unique constraint.

The second sentence of the comment can be written as: "The table sync
for test_tab2 on tap_sub will enter into infinite error loop due to
violating the unique constraint."
Fixed.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Oct 28, 2021 at 1:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 28, 2021 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 27, 2021 at 2:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Oct 27, 2021 at 10:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
BTW how useful is specifying LSN instead of XID in practice? Given
that this skipping behavior is used to skip the particular transaction
(or its part of operations) in question, I’m not sure specifying LSN
or time is useful.

I think if the user wants to skip multiple xacts, she might want to
use the highest LSN to skip instead of specifying individual xids.

I think it assumes that the situation where the user already knows
multiple transactions that cannot be applied on the subscription but
how do they know?

Either from the error messages in the server log or from the new view
we are planning to add. I think such a case is possible during the
initial synchronization phase where apply worker went ahead then
tablesync worker by skipping to apply the changes on the corresponding
table. After that it is possible, that table sync worker failed during
copy and apply worker fails during the processing of some other rel.

Does it mean that if both initial copy for the corresponding table by
table sync worker and applying changes for other rels by apply worker
fail, we skip both by specifying LSN?

Yes.
If so, can't we disable the
initial copy for the table and skip only the changes for other rels
that cannot be applied?

But anyway you need some way to skip changes via a particular
tablesync worker so that it can mark the relation in 'ready' state.
Right.
I
think one can also try to use disable_on_error option in such
scenarios depending on how we expose it. Say, if the option means that
all workers (apply or table sync) should be disabled on an error then
it would be a bit tricky but if we can come up with a way to behave
differently for different workers then it is possible to disable one
set of workers and skip the changes in another set of workers.
Yes, I would prefer to skip individual transactions in question rather
than skip changes until the particular LSN. It’s not advisable to use
LSN to skip changes since it has a risk of skipping unrelated changes
too.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Oct 28, 2021 at 1:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Oct 27, 2021 at 4:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Oct 27, 2021 at 8:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Oct 26, 2021 at 7:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You have a point. The other alternatives on this line could be:
Alter Subscription <sub_name> SKIP ( subscription_parameter [=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...

Looks better.
If we want to follow the above, then how do we allow users to reset
the parameter? One way is to allow the user to set xid as 0 which
would mean that we reset it. The other way is to allow SET/RESET
before SKIP but not sure if that is a good option.

After thinking some more on this, I think it is better to not use
SET/RESET keyword here. I think we can use a model similar to how we
allow setting some of the options in Alter Database:

# Set the connection limit for a database:
Alter Database akapila WITH connection_limit = 1;
# Reset the connection limit
Alter Database akapila WITH connection_limit = -1;

Thoughts?
Agreed.
Another thing I'm concerned about is that the syntax "SKIP (
subscription_parameter [=value] [, ...])" looks like we can specify
multiple options, for example, "SKIP (xid = '100', lsn =
'0/12345678')". Is there a case where we need to specify multiple
options? Perhaps when specifying the target XID and operations, for
example, "SKIP (xid = 100, action = 'insert, update')"?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Oct 28, 2021 at 10:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Another thing I'm concerned about is that the syntax "SKIP (
subscription_parameter [=value] [, ...])" looks like we can specify
multiple options, for example "SKIP (xid = '100', lsn =
'0/12345678')". Is there a case where we need to specify multiple
options? Perhaps when specifying the target XID and operations, for
example "SKIP (xid = 100, action = 'insert, update')"?
Yeah, or maybe prepared transaction identifier and actions. BTW, if we
want to proceed without the SET/RESET keyword then you can prepare the
SKIP xid patch as the second in the series and we can probably work on
the RESET syntax as a completely independent patch.
--
With Regards,
Amit Kapila.
On Thu, Oct 28, 2021 at 10:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Oct 28, 2021 at 1:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 28, 2021 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Either from the error messages in the server log or from the new view
we are planning to add. I think such a case is possible during the
initial synchronization phase where the apply worker went ahead of the
tablesync worker by skipping the changes for the corresponding
table. After that it is possible that the table sync worker failed
during copy and the apply worker fails during the processing of some
other rel.

Does it mean that if both the initial copy for the corresponding table
by the table sync worker and applying changes for other rels by the
apply worker fail, we skip both by specifying LSN?

Yes.
If so, can't we disable the
initial copy for the table and skip only the changes for other rels
that cannot be applied?

But anyway you need some way to skip changes via a particular
tablesync worker so that it can mark the relation in 'ready' state.

Right.
I
think one can also try to use disable_on_error option in such
scenarios depending on how we expose it. Say, if the option means that
all workers (apply or table sync) should be disabled on an error then
it would be a bit tricky but if we can come up with a way to behave
differently for different workers then it is possible to disable one
set of workers and skip the changes in another set of workers.

Yes, I would prefer to skip individual transactions in question rather
than skip changes until the particular LSN. It’s not advisable to use
LSN to skip changes since it has a risk of skipping unrelated changes
too.
Fair enough, but I think providing the LSN is also useful if the user
can identify it easily, as otherwise there might be more
administrative work to make replication progress.
--
With Regards,
Amit Kapila.
On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Thank you for the comments!
Few comments:
==============
1. Is the patch cleaning tablesync error entries except via vacuum? If
not, can't we send a message to remove tablesync errors once tablesync
is successful (say when we reset skip_xid or when tablesync is
finished) or when we drop subscription? I think the same applies to
apply worker. I think we may want to track it in some way whether an
error has occurred before sending the message but relying completely
on vacuum might be a recipe for bloat. I think in the case of a
drop subscription we can simply send the message as that is not a
frequent operation. I might be missing something here because in the
tests after drop subscription you are expecting the entries from the
view to get cleared.

Yes, I think we can have the tablesync worker send a message to drop stats
once tablesync is successful. But if we do that also when dropping a
subscription, I think we need to do that only after the transaction is
committed, since we can drop a subscription that doesn't have a
replication slot and roll back the transaction. Probably we can send
the message only when the subscription does have a replication slot.
Right. And probably for apply worker after updating skip xid.
In other cases, we can remember the subscriptions being dropped and
send the message to drop the statistics of them after committing the
transaction but I’m not sure it’s worth having it.
Yeah, let's not go to that extent. I think in most cases subscriptions
will have corresponding slots.
FWIW, we completely
rely on pgstat_vacuum_stat() for cleaning up the dead tables and
functions. And we don't expect there are many subscriptions on the
database.
True, but we do send it for the database, so let's do it for the cases
you explained in the first paragraph.
5.
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+	'postgres',
+	q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+	"SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');

Don't we need to wait after dropping the subscription and before
checking the view as there might be a slight delay in messages to get
cleared?

I think the test always passes without waiting for the statistics to
be updated since we fetch the subscription worker statistics from the
stats collector based on the entries of pg_subscription catalog. So
this test checks that statistics of an already-dropped subscription don't
show up in the view after DROP SUBSCRIPTION, but does not check if the
subscription worker statistics entry in the stats collector gets
removed. The primary reason is that as I mentioned above, the patch
relies on pgstat_vacuum_stat() for cleaning up the dead subscriptions.
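This behavior follows from the view definition in the attached patch:
stats rows are joined back to pg_subscription, so entries for dropped
subscriptions vanish from the view immediately even if the stats
collector still holds them. The view's FROM clause (simplified from the
v19 patch) is:

```sql
FROM (SELECT oid AS subid, NULL AS relid
      FROM pg_subscription
      UNION ALL
      SELECT srsubid AS subid, srrelid AS relid
      FROM pg_subscription_rel
      WHERE srsubstate <> 'r') sr,
     LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
     JOIN pg_subscription s ON (w.subid = s.oid)
```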
That makes sense.
7.
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter to
+# infinite error due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);

How can we ensure that subscriber would have caught up when one of the
tablesync workers is constantly in the error loop? Isn't it possible
that the subscriber didn't send the latest lsn feedback till the table
sync worker is finished?

I thought that even if tablesync for a table is still ongoing, the
apply worker can apply commit records, update write LSN and flush LSN,
and send the feedback to the wal sender. No?
You are right, this case will work.
--
With Regards,
Amit Kapila.
On Thu, Oct 21, 2021 at 10:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 20, 2021 at 12:33 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Oct 18, 2021 at 12:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so far.
Minor comment on patch 17-0003
Thank you for the comment!
src/backend/replication/logical/worker.c
(1) Typo in apply_handle_stream_abort() comment:
/* Stop skipping transaction transaction, if enabled */
should be:
/* Stop skipping transaction changes, if enabled */

Fixed.
I've attached updated patches.
I have started looking at the feature and reviewing the patch; my
initial comments:
1) I could specify an invalid subscription id to
pg_stat_reset_subscription_worker, which creates an assertion failure:
+static void
+pgstat_recv_resetsubworkercounter(PgStat_MsgResetsubworkercounter
*msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ Assert(OidIsValid(msg->m_subid));
+
+ /* Get subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid,
msg->m_subrelid, false);
postgres=# select pg_stat_reset_subscription_worker(NULL, NULL);
pg_stat_reset_subscription_worker
-----------------------------------
(1 row)
TRAP: FailedAssertion("OidIsValid(msg->m_subid)", File: "pgstat.c",
Line: 5742, PID: 789588)
postgres: stats collector (ExceptionalCondition+0xd0)[0x55d33bba4778]
postgres: stats collector (+0x545a43)[0x55d33b90aa43]
postgres: stats collector (+0x541fad)[0x55d33b906fad]
postgres: stats collector (pgstat_start+0xdd)[0x55d33b9020e1]
postgres: stats collector (+0x54ae0c)[0x55d33b90fe0c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7f8509ccc1f0]
/lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7f8509a78ac7]
postgres: stats collector (+0x548cab)[0x55d33b90dcab]
postgres: stats collector (PostmasterMain+0x134c)[0x55d33b90d5c6]
postgres: stats collector (+0x43b8be)[0x55d33b8008be]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7f8509992565]
postgres: stats collector (_start+0x2e)[0x55d33b48e4fe]
2) I was able to provide an invalid relation id to
pg_stat_reset_subscription_worker. Should we add any validation for
this?
select pg_stat_reset_subscription_worker(16389, -1);
+pg_stat_reset_subscription_worker(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker
error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync
worker error stats */
+
+ pgstat_reset_subworker_stats(subid, relid);
+
+ PG_RETURN_VOID();
+}
3) The 025_error_report test is failing because of a recent commit
that changed the way nodes are initialized in the TAP tests;
corresponding changes need to be made in 025_error_report:
t/025_error_report.pl .............. Dubious, test returned 2 (wstat 512, 0x200)
No subtests run
t/100_bugs.pl ...................... ok
Regards,
Vignesh
On Thu, Oct 28, 2021 at 6:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 28, 2021 at 10:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Another thing I'm concerned about is that the syntax "SKIP (
subscription_parameter [=value] [, ...])" looks like we can specify
multiple options, for example "SKIP (xid = '100', lsn =
'0/12345678')". Is there a case where we need to specify multiple
options? Perhaps when specifying the target XID and operations, for
example "SKIP (xid = 100, action = 'insert, update')"?

Yeah, or maybe prepared transaction identifier and actions.
Prepared transactions seem not to need to be skipped since those
changes are already successfully applied, though.
BTW, if we
want to proceed without the SET/RESET keyword then you can prepare the
SKIP xid patch as the second in the series and we can probably work on
the RESET syntax as a completely independent patch.
Right. If we do that, the second patch can be an independent patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Thank you for the comments!
Few comments:
==============
1. Is the patch cleaning tablesync error entries except via vacuum? If
not, can't we send a message to remove tablesync errors once tablesync
is successful (say when we reset skip_xid or when tablesync is
finished) or when we drop subscription? I think the same applies to
apply worker. I think we may want to track it in some way whether an
error has occurred before sending the message but relying completely
on vacuum might be a recipe for bloat. I think in the case of a
drop subscription we can simply send the message as that is not a
frequent operation. I might be missing something here because in the
tests after drop subscription you are expecting the entries from the
view to get cleared.

Yes, I think we can have the tablesync worker send a message to drop stats
once tablesync is successful. But if we do that also when dropping a
subscription, I think we need to do that only after the transaction is
committed, since we can drop a subscription that doesn't have a
replication slot and roll back the transaction. Probably we can send
the message only when the subscription does have a replication slot.

Right. And probably for apply worker after updating skip xid.
I'm not sure it's better to drop apply worker stats after resetting
skip xid (i.e., after skipping the transaction). Since the view is a
cumulative view and has last_error_time, I thought we can keep the
apply worker stats until the subscription gets dropped. Since the
error reporting message could get lost, the absence of an entry in the
view doesn't mean the worker didn't face an issue.
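For reference, identifying the failing remote transaction from the view
proposed in the attached patch could look like this (column names taken
from the v19 patch):

```sql
-- Find the remote xid and error details of the last failures
SELECT subname, relid, command, xid, error_count, last_error_time
FROM pg_stat_subscription_workers
ORDER BY last_error_time DESC;
```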
In other cases, we can remember the subscriptions being dropped and
send the message to drop the statistics of them after committing the
transaction but I'm not sure it's worth having it.

Yeah, let's not go to that extent. I think in most cases subscriptions
will have corresponding slots.
Agreed.
FWIW, we completely
rely on pgstat_vacuum_stat() for cleaning up the dead tables and
functions. And we don't expect there are many subscriptions on the
database.

True, but we do send it for the database, so let's do it for the cases
you explained in the first paragraph.
Agreed.
I've attached a new version patch. Since the syntax for skipping a
transaction id is still under discussion, I've attached only the error
reporting patch for now.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v19-0001-Add-a-subscription-worker-statistics-view-pg_sta.patchapplication/x-patch; name=v19-0001-Add-a-subscription-worker-statistics-view-pg_sta.patchDownload
From ea18cad8624a093aa103272c9ababa303a49e66e Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v19 1/3] Add a subscription worker statistics view
"pg_stat_subscription_workers".
This commit adds a new system view pg_stat_subscription_workers,
that shows information about any errors which occur during application
of logical replication changes as well as during performing initial table
synchronization.
The subscription error entries are removed by autovacuum workers after
table synchronization completes in table sync worker cases and after
dropping the subscription in apply worker cases.
It also adds an SQL function pg_stat_reset_subscription_worker() to
reset a single subscription error.
---
doc/src/sgml/monitoring.sgml | 161 ++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 25 +
src/backend/commands/subscriptioncmds.c | 15 +-
src/backend/postmaster/pgstat.c | 604 ++++++++++++++++++++
src/backend/replication/logical/tablesync.c | 13 +
src/backend/replication/logical/worker.c | 54 +-
src/backend/utils/adt/pgstatfuncs.c | 122 ++++
src/include/catalog/pg_proc.dat | 13 +
src/include/pgstat.h | 126 ++++
src/test/regress/expected/rules.out | 20 +
src/test/subscription/t/026_error_report.pl | 156 +++++
src/tools/pgindent/typedefs.list | 6 +
13 files changed, 1313 insertions(+), 4 deletions(-)
create mode 100644 src/test/subscription/t/026_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3173ec2566..094c7239fa 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>At least one row per subscription, showing information about
+ errors that occurred in the subscription.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3034,6 +3043,136 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription error reported by workers applying logical
+ replication changes and workers handling the initial data copy of the
+ subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; NULL for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This field
+ is always NULL if the error was reported during the initial data
+ copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is always NULL if the error was reported
+ during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which these statistics were last reset
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5156,6 +5295,28 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <parameter>relid</parameter> <type>oid</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription worker error. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets error statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ error statistics of the <literal>apply</literal> worker running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 54c93b16c4..921fce5546 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb560955cd..a2ee00c6fd 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,3 +1261,28 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.last_error_time,
+ w.stats_reset
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
+ JOIN pg_subscription s ON (w.subid = s.oid);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..18962b91e1 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -32,6 +32,7 @@
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/origin.h"
#include "replication/slot.h"
@@ -1204,7 +1205,8 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Since dropping a replication slot is not transactional, the replication
* slot stays dropped even if the transaction rolls back. So we cannot
* run DROP SUBSCRIPTION inside a transaction block if dropping the
- * replication slot.
+ * replication slot. Also, in this case, we report a message for dropping
+ * the subscription to the stats collector.
*
* XXX The command name should really be something like "DROP SUBSCRIPTION
* of a subscription that is associated with a replication slot", but we
@@ -1377,6 +1379,17 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
PG_END_TRY();
+ /*
+ * Send a message for dropping this subscription to the stats collector. We
+ * can safely report dropping the subscription statistics here if the
+ * subscription is associated with a replication slot since we cannot run
+ * DROP SUBSCRIPTION inside a transaction block. Subscription statistics will
+ * be removed later by (auto)vacuum either if it's not associated with a
+ * replication slot or if the message for dropping the subscription gets lost.
+ */
+ if (slotname)
+ pgstat_report_subscription_drop(subid);
+
table_close(rel, NoLock);
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b7d0fbaefd..b7883ec477 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
/* ----------
@@ -282,6 +285,7 @@ static PgStat_GlobalStats globalStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static HTAB *replSlotStatHash = NULL;
+static HTAB *subWorkerStatHash = NULL;
/*
* List of OIDs of databases we need to write out. If an entry is InvalidOid,
@@ -332,6 +336,13 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(Oid subid, Oid subrelid,
+ bool create);
+static void pgstat_reset_subworker_entry(PgStat_StatSubWorkerEntry *wentry, TimestampTz ts);
+static void pgstat_vacuum_subworker_stats(void);
+static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
+static void pgstat_send_subworker_purge(PgStat_MsgSubWorkerPurge *msg);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
@@ -356,6 +367,7 @@ static void pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, in
static void pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len);
static void pgstat_recv_resetslrucounter(PgStat_MsgResetslrucounter *msg, int len);
static void pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg, int len);
+static void pgstat_recv_resetsubworkercounter(PgStat_MsgResetsubworkercounter *msg, int len);
static void pgstat_recv_autovac(PgStat_MsgAutovacStart *msg, int len);
static void pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len);
static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len);
@@ -373,6 +385,9 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
+static void pgstat_recv_subworker_purge(PgStat_MsgSubWorkerPurge *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1178,6 +1193,10 @@ pgstat_vacuum_stat(void)
}
}
+ /* Cleanup the dead subscription workers statistics */
+ if (subWorkerStatHash)
+ pgstat_vacuum_subworker_stats();
+
/*
* Lookup our own database entry; if not found, nothing more to do.
*/
@@ -1355,6 +1374,210 @@ pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid)
}
+/* PgStat_StatSubWorkerEntry comparator sorting by subid and subrelid */
+static int
+subworker_stats_comparator(const ListCell *a, const ListCell *b)
+{
+ PgStat_StatSubWorkerEntry *entry1 = (PgStat_StatSubWorkerEntry *) lfirst(a);
+ PgStat_StatSubWorkerEntry *entry2 = (PgStat_StatSubWorkerEntry *) lfirst(b);
+ int ret;
+
+ ret = oid_cmp(&entry1->key.subid, &entry2->key.subid);
+ if (ret != 0)
+ return ret;
+
+ return oid_cmp(&entry1->key.subrelid, &entry2->key.subrelid);
+}
+
+/* ----------
+ * pgstat_vacuum_subworker_stats() -
+ *
+ * Subroutine for pgstat_vacuum_stat: tell the collector to remove dead
+ * subscriptions and worker statistics.
+ * ----------
+ */
+static void
+pgstat_vacuum_subworker_stats(void)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+ PgStat_MsgSubWorkerPurge wpmsg;
+ HASH_SEQ_STATUS hstat;
+ HTAB *subids;
+ List *subworker_stats = NIL;
+ List *not_ready_rels = NIL;
+ ListCell *lc1;
+
+ /* Build the list of worker stats and sort it by subid and relid */
+ hash_seq_init(&hstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ subworker_stats = lappend(subworker_stats, wentry);
+ list_sort(subworker_stats, subworker_stats_comparator);
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ subids = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ /*
+ * Search for all the dead subscriptions and unnecessary table sync worker
+ * entries in stats hashtable and tell the stats collector to drop them.
+ */
+ spmsg.m_nentries = 0;
+ wpmsg.m_nentries = 0;
+ wpmsg.m_subid = InvalidOid;
+ foreach(lc1, subworker_stats)
+ {
+ ListCell *lc2;
+ bool keep_it = false;
+
+ wentry = (PgStat_StatSubWorkerEntry *) lfirst(lc1);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Skip if we already registered this subscription to purge */
+ if (spmsg.m_nentries > 0 &&
+ spmsg.m_subids[spmsg.m_nentries - 1] == wentry->key.subid)
+ continue;
+
+ /* Check if the subscription is dead */
+ if (hash_search(subids, (void *) &(wentry->key.subid), HASH_FIND, NULL) == NULL)
+ {
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = wentry->key.subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_send_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+
+ continue;
+ }
+
+ /*
+ * This subscription is alive. Next, search for table sync worker
+ * entries whose tables are already in sync (ready) state; these
+ * should be removed.
+ */
+
+ /* We remove only table sync entries in the current database */
+ if (wentry->dbid != MyDatabaseId)
+ continue;
+
+ /* Skip if it's an apply worker entry */
+ if (!OidIsValid(wentry->key.subrelid))
+ continue;
+
+ if (wpmsg.m_subid != wentry->key.subid)
+ {
+ /*
+ * Send the purge message for previously collected table sync
+ * entries, if there are any.
+ */
+ if (wpmsg.m_nentries > 0)
+ {
+ pgstat_send_subworker_purge(&wpmsg);
+ wpmsg.m_nentries = 0;
+ }
+
+ /* Clean up the previously collected relations */
+ list_free_deep(not_ready_rels);
+
+ /* Refresh the not-ready-relations of this subscription */
+ not_ready_rels = GetSubscriptionNotReadyRelations(wentry->key.subid);
+
+ /* Prepare the worker purge message for the subscription */
+ wpmsg.m_subid = wentry->key.subid;
+ }
+
+ /*
+ * Check if the table is still being synchronized or no longer belongs
+ * to the subscription.
+ */
+ foreach(lc2, not_ready_rels)
+ {
+ SubscriptionRelState *relstate = (SubscriptionRelState *) lfirst(lc2);
+
+ if (relstate->relid == wentry->key.subrelid)
+ {
+ /* This table is still being synchronized, so keep it */
+ keep_it = true;
+ break;
+ }
+ }
+
+ if (keep_it)
+ continue;
+
+ /* Add the table to the worker purge message */
+ wpmsg.m_relids[wpmsg.m_nentries++] = wentry->key.subrelid;
+
+ /*
+ * If the worker purge message is full, send it out and reinitialize
+ * to empty
+ */
+ if (wpmsg.m_nentries >= PGSTAT_NUM_SUBWORKERPURGE)
+ {
+ pgstat_send_subworker_purge(&wpmsg);
+ wpmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_send_subscription_purge(&spmsg);
+
+ /* Send the rest of dead worker entries */
+ if (wpmsg.m_nentries > 0)
+ pgstat_send_subworker_purge(&wpmsg);
+
+ /* Clean up */
+ list_free_deep(not_ready_rels);
+ list_free(subworker_stats);
+ hash_destroy(subids);
+}
+
+/* --------
+ * pgstat_send_subscription_purge() -
+ *
+ * Send a subscription purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
+
+/* --------
+ * pgstat_send_subworker_purge() -
+ *
+ * Send a subscription worker purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subworker_purge(PgStat_MsgSubWorkerPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubWorkerPurge, m_relids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBWORKERPURGE);
+ pgstat_send(msg, len);
+}
+
/* ----------
* pgstat_drop_database() -
*
@@ -1544,6 +1767,24 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subworker_stats() -
+ *
+ * Tell the collector to reset the subscription worker statistics.
+ * ----------
+ */
+void
+pgstat_reset_subworker_stats(Oid subid, Oid subrelid)
+{
+ PgStat_MsgResetsubworkercounter msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETSUBWORKERCOUNTER);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+
+ pgstat_send(&msg, sizeof(PgStat_MsgResetsubworkercounter));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1869,6 +2110,70 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+ len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_dbid = MyDatabaseId;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_report_subworker_drop() -
+ *
+ * Tell the collector about dropping the subscription worker statistics.
+ * This is used when a table sync worker exits.
+ * ----------
+ */
+void
+pgstat_report_subworker_drop(Oid subid, Oid subrelid)
+{
+ PgStat_MsgSubWorkerPurge msg;
+
+ msg.m_subid = subid;
+ msg.m_relids[0] = subrelid;
+ msg.m_nentries = 1;
+ pgstat_send_subworker_purge(&msg);
+}
+
+/* ----------
+ * pgstat_report_subscription_drop() -
+ *
+ * Tell the collector about dropping the subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_drop(Oid subid)
+{
+ PgStat_MsgSubscriptionPurge msg;
+
+ msg.m_subids[0] = subid;
+ msg.m_nentries = 1;
+ pgstat_send_subscription_purge(&msg);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2987,6 +3292,22 @@ pgstat_fetch_replslot(NameData slotname)
return pgstat_get_replslot_entry(slotname, false);
}
+/*
+ * ---------
+ * pgstat_fetch_subworker() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_subworker(Oid subid, Oid subrelid)
+{
+ backend_read_statsfile();
+
+ return pgstat_get_subworker_entry(subid, subrelid, false);
+}
+
/*
* Shut down a single backend's statistics reporting at process exit.
*
@@ -3498,6 +3819,11 @@ PgstatCollectorMain(int argc, char *argv[])
len);
break;
+ case PGSTAT_MTYPE_RESETSUBWORKERCOUNTER:
+ pgstat_recv_resetsubworkercounter(&msg.msg_resetsubworkercounter,
+ len);
+ break;
+
case PGSTAT_MTYPE_AUTOVAC_START:
pgstat_recv_autovac(&msg.msg_autovacuum_start, len);
break;
@@ -3568,6 +3894,18 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERPURGE:
+ pgstat_recv_subworker_purge(&msg.msg_subworkerpurge, len);
+ break;
+
default:
break;
}
@@ -3868,6 +4206,22 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
}
}
+ /*
+ * Write subscription worker stats struct
+ */
+ if (subWorkerStatHash)
+ {
+ PgStat_StatSubWorkerEntry *wentry;
+
+ hash_seq_init(&hstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(wentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4329,6 +4683,48 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
break;
}
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ {
+ PgStat_StatSubWorkerEntry wbuf;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Read the subscription entry */
+ if (fread(&wbuf, 1, sizeof(PgStat_StatSubWorkerEntry), fpin)
+ != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /* Create hash table if we don't have it already. */
+ if (subWorkerStatHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ subWorkerStatHash = hash_create("Subscription worker stat entries",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ }
+
+ /* Enter the subscription entry and initialize fields */
+ wentry =
+ (PgStat_StatSubWorkerEntry *) hash_search(subWorkerStatHash,
+ (void *) &wbuf.key,
+ HASH_ENTER, NULL);
+ memcpy(wentry, &wbuf, sizeof(PgStat_StatSubWorkerEntry));
+ break;
+ }
+
case 'E':
goto done;
@@ -4541,6 +4937,7 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
PgStat_WalStats myWalStats;
PgStat_SLRUStats mySLRUStats[SLRU_NUM_ELEMENTS];
PgStat_StatReplSlotEntry myReplSlotStats;
+ PgStat_StatSubWorkerEntry mySubWorkerStats;
FILE *fpin;
int32 format_id;
const char *statfile = permanent ? PGSTAT_STAT_PERMANENT_FILENAME : pgstat_stat_filename;
@@ -4671,6 +5068,22 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
}
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&mySubWorkerStats, 1, sizeof(mySubWorkerStats), fpin)
+ != sizeof(mySubWorkerStats))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+ break;
+
case 'E':
goto done;
@@ -4876,6 +5289,7 @@ pgstat_clear_snapshot(void)
pgStatLocalContext = NULL;
pgStatDBHash = NULL;
replSlotStatHash = NULL;
+ subWorkerStatHash = NULL;
/*
* Historically the backend_status.c facilities lived in this file, and
@@ -5344,6 +5758,31 @@ pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
}
}
+/* ----------
+ * pgstat_recv_resetsubworkercounter() -
+ *
+ * Process a RESETSUBWORKERCOUNTER message.
+ * ----------
+ */
+static void
+pgstat_recv_resetsubworkercounter(PgStat_MsgResetsubworkercounter *msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Get subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, false);
+
+ /*
+ * Nothing to do if the subscription error entry is not found. This could
+ * happen when the subscription is dropped and the message for dropping
+ * the subscription entry arrives before the message for resetting the error.
+ */
+ if (wentry == NULL)
+ return;
+
+ /* reset the entry and set reset timestamp */
+ pgstat_reset_subworker_entry(wentry, GetCurrentTimestamp());
+}
/* ----------
* pgstat_recv_autovac() -
@@ -5816,6 +6255,105 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS sstat;
+ PgStat_StatSubWorkerEntry *wentry;
+
+ if (subWorkerStatHash == NULL)
+ return;
+
+ /* Remove all subscription worker statistics of the given subscriptions */
+ hash_seq_init(&sstat, subWorkerStatHash);
+ while ((wentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (wentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(subWorkerStatHash, (void *) &(wentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+
+ /* Get the subscription worker stats */
+ wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, true);
+ Assert(wentry);
+
+ /*
+ * Update only the counter and timestamp if we received the same error
+ * again
+ */
+ if (wentry->dbid == msg->m_dbid &&
+ wentry->relid == msg->m_relid &&
+ wentry->command == msg->m_command &&
+ wentry->xid == msg->m_xid &&
+ strcmp(wentry->error_message, msg->m_message) == 0)
+ {
+ wentry->error_count++;
+ wentry->error_time = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ wentry->dbid = msg->m_dbid;
+ wentry->relid = msg->m_relid;
+ wentry->command = msg->m_command;
+ wentry->xid = msg->m_xid;
+ wentry->error_count = 1;
+ wentry->error_time = msg->m_timestamp;
+ strlcpy(wentry->error_message, msg->m_message, PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
+/* ----------
+ * pgstat_recv_subworker_purge() -
+ *
+ * Process a SUBWORKERPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_purge(PgStat_MsgSubWorkerPurge *msg, int len)
+{
+ PgStat_StatSubWorkerKey key;
+
+ if (subWorkerStatHash == NULL)
+ return;
+
+ key.subid = msg->m_subid;
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ /*
+ * This must be a table sync worker entry, since apply worker
+ * statistics are dropped only when the subscription is dropped.
+ */
+ Assert(OidIsValid(msg->m_relids[i]));
+
+ key.subrelid = msg->m_relids[i];
+ (void) hash_search(subWorkerStatHash, (void *) &key, HASH_REMOVE, NULL);
+ }
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5934,6 +6472,72 @@ pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotent, TimestampTz ts)
slotent->stat_reset_timestamp = ts;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return the subscription worker entry for the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, the entry belongs to the apply
+ * worker; otherwise it belongs to the table sync worker associated with
+ * subrelid. If no entry exists and the create parameter is true, a new
+ * entry is initialized; otherwise NULL is returned.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(Oid subid, Oid subrelid, bool create)
+{
+ PgStat_StatSubWorkerEntry *wentry;
+ PgStat_StatSubWorkerKey key;
+ HASHACTION action;
+ bool found;
+
+ if (subWorkerStatHash == NULL)
+ {
+ HASHCTL hash_ctl;
+
+ if (!create)
+ return NULL;
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ subWorkerStatHash = hash_create("Subscription worker stat entries",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
+ }
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ action = (create ? HASH_ENTER : HASH_FIND);
+ wentry = (PgStat_StatSubWorkerEntry *) hash_search(subWorkerStatHash,
+ (void *) &key,
+ action, &found);
+
+ /* initialize fields */
+ if (create && !found)
+ pgstat_reset_subworker_entry(wentry, 0);
+
+ return wentry;
+}
+
+/* ----------
+ * pgstat_reset_subworker_entry
+ *
+ * Reset the given subscription worker statistics.
+ * ----------
+ */
+static void
+pgstat_reset_subworker_entry(PgStat_StatSubWorkerEntry *wentry, TimestampTz ts)
+{
+ wentry->dbid = InvalidOid;
+ wentry->relid = InvalidOid;
+ wentry->command = 0;
+ wentry->xid = InvalidTransactionId;
+ wentry->error_count = 0;
+ wentry->error_time = 0;
+ wentry->error_message[0] = '\0';
+ wentry->stat_reset_timestamp = ts;
+}
+
/*
* pgstat_slru_index
*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a43c..9b6d0579b4 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -331,6 +331,19 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*/
ReplicationSlotDropAtPubNode(LogRepWorkerWalRcvConn, syncslotname, false);
+ /*
+ * Send a message to drop the subscription worker statistics to the
+ * stats collector. Since there is no guarantee of the order of
+ * message transfer on a UDP connection, it's possible that a message
+ * for reporting statistics such as an error reaches after a message
+ * for removing the statistics. If the message reached in reverse or
+ * the message got lost, we could not drop the statistics. But
+ * (auto)vacuum cleans up the statistics of the subscription worker who
+ * is already in a ready state.
+ */
+ pgstat_report_subworker_drop(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid);
+
finish_sync_worker();
}
else
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 8d96c926b4..3a40684fa5 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3329,6 +3329,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3429,8 +3430,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding
+ * LogicalRepMsgType for table synchronization, so pass 0.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3548,7 +3571,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..2511df1c67 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,23 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset subscription worker statistics */
+Datum
+pg_stat_reset_subscription_worker(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid;
+
+ if (PG_ARGISNULL(1))
+ relid = InvalidOid; /* reset apply worker error stats */
+ else
+ relid = PG_GETARG_OID(1); /* reset table sync worker error stats */
+
+ pgstat_reset_subworker_stats(subid, relid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2397,107 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "error_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "stats_reset",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_subworker(subid, subrelid);
+
+ /* Return NULL if there are no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* relid */
+ if (OidIsValid(wentry->relid))
+ values[i++] = ObjectIdGetDatum(wentry->relid);
+ else
+ nulls[i++] = true;
+
+ /* command */
+ if (wentry->command != 0)
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(wentry->command));
+ else
+ nulls[i++] = true;
+
+ /* xid */
+ if (TransactionIdIsValid(wentry->xid))
+ values[i++] = TransactionIdGetDatum(wentry->xid);
+ else
+ nulls[i++] = true;
+
+ /* error_count */
+ values[i++] = Int64GetDatum(wentry->error_count);
+
+ /* error_message */
+ values[i++] = CStringGetTextDatum(wentry->error_message);
+
+ /* last_error_time */
+ if (wentry->error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->error_time);
+ else
+ nulls[i++] = true;
+
+ /* stats_reset */
+ if (wentry->stat_reset_timestamp != 0)
+ values[i++] = TimestampTzGetDatum(wentry->stat_reset_timestamp);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..e6c7abbdcc 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription worker',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,relid,command,xid,error_count,error_message,last_error_time,stats_reset}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,11 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..7a26d6db3f 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,7 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_RESETSUBWORKERCOUNTER,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -83,6 +85,9 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
+ PGSTAT_MTYPE_SUBWORKERPURGE,
} StatMsgType;
/* ----------
@@ -389,6 +394,24 @@ typedef struct PgStat_MsgResetreplslotcounter
bool clearall;
} PgStat_MsgResetreplslotcounter;
+/* ----------
+ * PgStat_MsgResetsubworkercounter Sent by the backend to reset the subscription
+ * worker statistics.
+ * ----------
+ */
+typedef struct PgStat_MsgResetsubworkercounter
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * As in PgStat_MsgSubWorkerError, m_subid and m_subrelid are used to
+ * determine the subscription and the reporter of the error: the apply
+ * worker or the table sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+} PgStat_MsgResetsubworkercounter;
+
/* ----------
* PgStat_MsgAutovacStart Sent by the autovacuum daemon to signal
* that a database is going to be processed
@@ -536,6 +559,68 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about the dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerPurge Sent by the backend and autovacuum to purge
+ * the subscription worker statistics.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBWORKERPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubWorkerPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_subid;
+ int m_nentries;
+ Oid m_relids[PGSTAT_NUM_SUBWORKERPURGE];
+} PgStat_MsgSubWorkerPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
+ * report an error that occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of the error. m_subrelid is InvalidOid if the error is reported
+ * by an apply worker, otherwise by a table sync worker.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * Oids of the database and the table that the reporter was actually
+ * processing. m_relid can be InvalidOid if the error occurred while
+ * the worker was applying a non-data-modification message such as RELATION.
+ */
+ Oid m_dbid;
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -697,6 +782,7 @@ typedef union PgStat_Msg
PgStat_MsgResetsinglecounter msg_resetsinglecounter;
PgStat_MsgResetslrucounter msg_resetslrucounter;
PgStat_MsgResetreplslotcounter msg_resetreplslotcounter;
+ PgStat_MsgResetsubworkercounter msg_resetsubworkercounter;
PgStat_MsgAutovacStart msg_autovacuum_start;
PgStat_MsgVacuum msg_vacuum;
PgStat_MsgAnalyze msg_analyze;
@@ -714,6 +800,9 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
+ PgStat_MsgSubWorkerPurge msg_subworkerpurge;
} PgStat_Msg;
@@ -929,6 +1018,36 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of logical replication or the initial table
+ * synchronization.
+ */
+ Oid dbid;
+ Oid relid;
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter error_count;
+ TimestampTz error_time;
+ char error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+ TimestampTz stat_reset_timestamp;
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1022,6 +1141,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
@@ -1038,6 +1158,11 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_report_subworker_drop(Oid subid, Oid subrelid);
+extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
@@ -1136,6 +1261,7 @@ extern PgStat_GlobalStats *pgstat_fetch_global(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_StatReplSlotEntry *pgstat_fetch_replslot(NameData slotname);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_subworker(Oid subid, Oid subrelid);
extern void pgstat_count_slru_page_zeroed(int slru_idx);
extern void pgstat_count_slru_page_hit(int slru_idx);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..f6b1bd657f 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,26 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.last_error_time,
+ w.stats_reset
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, relid, command, xid, error_count, error_message, last_error_time, stats_reset)
+ JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/026_error_report.pl b/src/test/subscription/t/026_error_report.pl
new file mode 100644
index 0000000000..3d23bb55d4
--- /dev/null
+++ b/src/test/subscription/t/026_error_report.pl
@@ -0,0 +1,156 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 5;
+
+# Test that the error reported in the pg_stat_subscription_workers view is as expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass];
+ $check_sql .= " AND xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, command, relid::regclass, error_count > 0
+FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter into
+# infinite error loop due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 7bbbb34e2f..12cce497a3 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1942,7 +1942,11 @@ PgStat_MsgResetreplslotcounter
PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
+PgStat_MsgResetsubworkererror
PgStat_MsgSLRU
+PgStat_MsgSubWorkerError
+PgStat_MsgSubWorkerErrorPurge
+PgStat_MsgSubWorkerPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1954,6 +1958,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Thu, Oct 28, 2021 at 7:47 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, Oct 21, 2021 at 10:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 20, 2021 at 12:33 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Oct 18, 2021 at 12:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches that incorporate all comments I got so far.
Minor comment on patch 17-0003
Thank you for the comment!
src/backend/replication/logical/worker.c
(1) Typo in apply_handle_stream_abort() comment:
/* Stop skipping transaction transaction, if enabled */
should be:
/* Stop skipping transaction changes, if enabled */
Fixed.
I've attached updated patches.
I have started to have a look at the feature and review the patch, my
initial comments:
Thank you for the comments!
1) I could specify invalid subscriber id to
pg_stat_reset_subscription_worker which creates an assertion failure?

+static void
+pgstat_recv_resetsubworkercounter(PgStat_MsgResetsubworkercounter *msg, int len)
+{
+	PgStat_StatSubWorkerEntry *wentry;
+
+	Assert(OidIsValid(msg->m_subid));
+
+	/* Get subscription worker stats */
+	wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, false);

postgres=# select pg_stat_reset_subscription_worker(NULL, NULL);
 pg_stat_reset_subscription_worker
-----------------------------------

(1 row)
TRAP: FailedAssertion("OidIsValid(msg->m_subid)", File: "pgstat.c",
Line: 5742, PID: 789588)
postgres: stats collector (ExceptionalCondition+0xd0)[0x55d33bba4778]
postgres: stats collector (+0x545a43)[0x55d33b90aa43]
postgres: stats collector (+0x541fad)[0x55d33b906fad]
postgres: stats collector (pgstat_start+0xdd)[0x55d33b9020e1]
postgres: stats collector (+0x54ae0c)[0x55d33b90fe0c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7f8509ccc1f0]
/lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7f8509a78ac7]
postgres: stats collector (+0x548cab)[0x55d33b90dcab]
postgres: stats collector (PostmasterMain+0x134c)[0x55d33b90d5c6]
postgres: stats collector (+0x43b8be)[0x55d33b8008be]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7f8509992565]
postgres: stats collector (_start+0x2e)[0x55d33b48e4fe]
Good catch. Fixed.
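For readers following along, the shape of the fix is to validate the incoming OID in the stats collector's message handler rather than asserting on it. A minimal standalone sketch of that idea follows; the struct and function names are simplified stand-ins, not the actual pgstat API:

```c
#include <stdint.h>

typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)
#define OidIsValid(oid) ((oid) != InvalidOid)

/* Simplified stand-in for the reset-counter message; not the real
 * PgStat_MsgResetsubworkercounter layout. */
typedef struct ResetSubWorkerMsg
{
	Oid			m_subid;
	Oid			m_subrelid;
} ResetSubWorkerMsg;

/*
 * Instead of Assert(OidIsValid(msg->m_subid)), which brings down the
 * stats collector when a NULL subid reaches it, ignore the bogus
 * message: stats messages are advisory, so dropping a malformed one
 * is harmless.  Returns 1 if the message was processed, 0 if ignored.
 */
static int
recv_reset_subworker(const ResetSubWorkerMsg *msg)
{
	if (!OidIsValid(msg->m_subid))
		return 0;				/* ignore invalid requests */

	/* ... look up and reset the (m_subid, m_subrelid) entry here ... */
	return 1;
}
```

The key design point is that the collector process must never crash on client-supplied input; a silently ignored message is the conventional failure mode for the stats subsystem.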
2) I was able to provide invalid relation id for
pg_stat_reset_subscription_worker? Should we add any validation for
this?
select pg_stat_reset_subscription_worker(16389, -1);

+pg_stat_reset_subscription_worker(PG_FUNCTION_ARGS)
+{
+	Oid			subid = PG_GETARG_OID(0);
+	Oid			relid;
+
+	if (PG_ARGISNULL(1))
+		relid = InvalidOid;		/* reset apply worker error stats */
+	else
+		relid = PG_GETARG_OID(1);	/* reset table sync worker error stats */
+
+	pgstat_reset_subworker_stats(subid, relid);
+
+	PG_RETURN_VOID();
+}

I don't think that validation is necessary here. OID '-1' is interpreted as
4294967295 and we don't reject it.
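The point about '-1' can be seen with a tiny standalone example (illustrative only, using a local Oid typedef rather than the PostgreSQL headers):

```c
#include <stdint.h>

typedef uint32_t Oid;

/*
 * OIDs are unsigned 32-bit integers, so a "-1" supplied at the SQL
 * level wraps to the maximum OID value (4294967295) rather than
 * being rejected as negative.
 */
static Oid
oid_from_int(int value)
{
	return (Oid) value;
}
```

Since 4294967295 is simply a nonexistent OID, the reset function finds no matching entry and does nothing, which is why no explicit validation is needed.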
3) The 025_error_report test is failing because one of the recent
commits has made some changes in the way the node is initialized in
the tap tests, corresponding changes need to be done in
025_error_report:
t/025_error_report.pl .............. Dubious, test returned 2 (wstat 512, 0x200)
No subtests run
t/100_bugs.pl ...................... ok
Fixed.
These comments are incorporated into the latest version patch I just
submitted[1].
Regards,
[1]: /messages/by-id/CAD21AoDY-9_x819F_m1_wfCVXXFJrGiSmR2MfC9Nw4nW8Om0qA@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Oct 29, 2021 at 6:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Oct 28, 2021 at 6:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 28, 2021 at 10:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Another thing I'm concerned is that the syntax "SKIP (
subscription_parameter [=value] [, ...])" looks like we can specify
multiple options for example, "SKIP (xid = '100', lsn =
'0/12345678’)”. Is there a case where we need to specify multiple
options? Perhaps when specifying the target XID and operations for
example, “SKIP (xid = 100, action = ‘insert, update’)”?
Yeah, or maybe prepared transaction identifier and actions.
Prepared transactions seem not to need to be skipped since those
changes are already successfully applied, though.
I think it can also fail before apply of prepare is successful. Right
now, we are just logging xid in error cases but gid could also be
logged as we receive that in begin_prepare. I think currently xid is
sufficient but I have given this as an example for future
consideration.
--
With Regards,
Amit Kapila.
On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Thank you for the comments!
Few comments:
==============
1. Is the patch cleaning tablesync error entries except via vacuum? If
not, can't we send a message to remove tablesync errors once tablesync
is successful (say when we reset skip_xid or when tablesync is
finished) or when we drop subscription? I think the same applies to
apply worker. I think we may want to track it in some way whether an
error has occurred before sending the message but relying completely
on a vacuum might be the recipe of bloat. I think in the case of a
drop subscription we can simply send the message as that is not a
frequent operation. I might be missing something here because in the
tests after drop subscription you are expecting the entries from the
view to get cleared.
Yes, I think we can have tablesync worker send a message to drop stats
once tablesync is successful. But if we do that also when dropping a
subscription, I think we need to do that only when the transaction is
committed since we can drop a subscription that doesn't have a
replication slot and rollback the transaction. Probably we can send
the message only when the subscription does have a replication slot.
Right. And probably for apply worker after updating skip xid.
I'm not sure it's better to drop apply worker stats after resetting
skip xid (i.e., after skipping the transaction). Since the view is a
cumulative view and has last_error_time, I thought we can have the
apply worker stats until the subscription gets dropped.
Fair enough. So statistics can be removed either by vacuum or drop
subscription. Also, if we go by this logic then there is no harm in
retaining the stat entries for tablesync errors. Why have different
behavior for apply and tablesync workers?
I have another question in this regard. Currently, the reset function
seems to be resetting only the first stat entry for a subscription.
But can't we have multiple stat entries for a subscription considering
the view's cumulative nature?
--
With Regards,
Amit Kapila.
On Fri, Oct 29, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I'm not sure it's better to drop apply worker stats after resetting
skip xid (i.e., after skipping the transaction). Since the view is a
cumulative view and has last_error_time, I thought we can have the
apply worker stats until the subscription gets dropped.
Fair enough. So statistics can be removed either by vacuum or drop
subscription. Also, if we go by this logic then there is no harm in
retaining the stat entries for tablesync errors. Why have different
behavior for apply and tablesync workers?
I have another question in this regard. Currently, the reset function
seems to be resetting only the first stat entry for a subscription.
But can't we have multiple stat entries for a subscription considering
the view's cumulative nature?
Don't we want these stats to be dealt with in the same way as tables and
functions as all the stats entries (subscription entries) are specific
to a particular database? If so, I think we should write/read these
to/from db specific stats file in the same way as we do for tables or
functions. I think in the current patch, it will unnecessarily read
and probably write subscription stats even when those are not
required.
--
With Regards,
Amit Kapila.
On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Thank you for the comments!
Few comments:
==============
1. Is the patch cleaning tablesync error entries except via vacuum? If
not, can't we send a message to remove tablesync errors once tablesync
is successful (say when we reset skip_xid or when tablesync is
finished) or when we drop subscription? I think the same applies to
apply worker. I think we may want to track it in some way whether an
error has occurred before sending the message but relying completely
on a vacuum might be the recipe of bloat. I think in the case of a
drop subscription we can simply send the message as that is not a
frequent operation. I might be missing something here because in the
tests after drop subscription you are expecting the entries from the
view to get cleared.
Yes, I think we can have tablesync worker send a message to drop stats
once tablesync is successful. But if we do that also when dropping a
subscription, I think we need to do that only when the transaction is
committed since we can drop a subscription that doesn't have a
replication slot and rollback the transaction. Probably we can send
the message only when the subscription does have a replication slot.
Right. And probably for apply worker after updating skip xid.
I'm not sure it's better to drop apply worker stats after resetting
skip xid (i.e., after skipping the transaction). Since the view is a
cumulative view and has last_error_time, I thought we can have the
apply worker stats until the subscription gets dropped.
Fair enough. So statistics can be removed either by vacuum or drop
subscription. Also, if we go by this logic then there is no harm in
retaining the stat entries for tablesync errors. Why have different
behavior for apply and tablesync workers?
My understanding is that the subscription worker statistics entry
corresponds to workers (but not physical workers since the physical
process is changed after restarting). So if the worker finishes its
jobs, it is no longer necessary to show errors since further problems
will not occur after that. Table sync worker’s job finishes when
completing table copy (unless table sync is performed again by REFRESH
PUBLICATION) whereas apply worker’s job finishes when the subscription
is dropped. Also, I’m concerned about a situation like where a lot of
table sync failed. In which case, if we don’t drop table sync worker
statistics after completing its job, we end up having a lot of entries
in the view unless the subscription is dropped.
I have another question in this regard. Currently, the reset function
seems to be resetting only the first stat entry for a subscription.
But can't we have multiple stat entries for a subscription considering
the view's cumulative nature?
I might be missing your points but I think that with the current
patch, the view has multiple entries for a subscription. That is,
there is one apply worker stats and multiple table sync worker stats
per subscription. And pg_stat_reset_subscription() function can reset
any stats by specifying subscription OID and relation OID.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Sat, Oct 30, 2021 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Oct 29, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I'm not sure it's better to drop apply worker stats after resetting
skip xid (i.e., after skipping the transaction). Since the view is a
cumulative view and has last_error_time, I thought we can have the
apply worker stats until the subscription gets dropped.
Fair enough. So statistics can be removed either by vacuum or drop
subscription. Also, if we go by this logic then there is no harm in
retaining the stat entries for tablesync errors. Why have different
behavior for apply and tablesync workers?
I have another question in this regard. Currently, the reset function
seems to be resetting only the first stat entry for a subscription.
But can't we have multiple stat entries for a subscription considering
the view's cumulative nature?
Don't we want these stats to be dealt with in the same way as tables and
functions as all the stats entries (subscription entries) are specific
to a particular database? If so, I think we should write/read these
to/from db specific stats file in the same way as we do for tables or
functions. I think in the current patch, it will unnecessarily read
and probably write subscription stats even when those are not
required.
Good point! So probably we should have PgStat_StatDBEntry have the
hash table for subscription worker statistics, right?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Oct 29, 2021 at 4:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached a new version patch. Since the syntax of skipping
transaction id is under the discussion I've attached only the error
reporting patch for now.
I have some comments on the v19-0001 patch:
v19-0001
(1) doc/src/sgml/monitoring.sgml
Seems to be missing the word "information":
BEFORE:
+ <entry>At least one row per subscription, showing about errors that
+ occurred on subscription.
AFTER:
+ <entry>At least one row per subscription, showing information about
+ errors that occurred on subscription.
(2) pg_stat_reset_subscription_worker(subid Oid, relid Oid)
First of all, I think that the documentation for this function should
make it clear that a non-NULL "subid" parameter is required for both
reset cases (tablesync and apply).
Perhaps this could be done by simply changing the first sentence to say:
"Resets statistics of a single subscription worker error, for a worker
running on subscription with <parameter>subid</parameter>."
(and then can remove " running on the subscription with
<parameter>subid</parameter>" from the last sentence)
I think that the documentation for this function should say that it
should be used in conjunction with the "pg_stat_subscription_workers"
view in order to obtain the required subid/relid values for resetting.
(and should provide a link to the documentation for that view)
Also, I think that the function documentation should make it clear
that the tablesync error case is indicated by a NULL "command" in the
information returned from the "pg_stat_subscription_workers" view
(otherwise the user needs to look at the server log in order to
determine whether the error is for the apply/tablesync worker).
Finally, there are currently no tests for this new function.
(3) pg_stat_subscription_workers
In the documentation for this, the description for the "command"
column says: "This field is always NULL if the error was reported
during the initial data copy."
Some users may not realise that this refers to "tablesync", so perhaps
add " (tablesync)" to the end of this sentence, or similar.
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Nov 1, 2021 at 7:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Fair enough. So statistics can be removed either by vacuum or drop
subscription. Also, if we go by this logic then there is no harm in
retaining the stat entries for tablesync errors. Why have different
behavior for apply and tablesync workers?
My understanding is that the subscription worker statistics entry
corresponds to workers (but not physical workers since the physical
process is changed after restarting). So if the worker finishes its
jobs, it is no longer necessary to show errors since further problems
will not occur after that. Table sync worker’s job finishes when
completing table copy (unless table sync is performed again by REFRESH
PUBLICATION) whereas apply worker’s job finishes when the subscription
is dropped.
Actually, I am not very sure how users can use the old error
information after we allowed skipping the conflicting xid. Say, if
they want to add/remove some constraints on the table based on
previous errors then they might want to refer to errors of both the
apply worker and table sync worker.
Also, I’m concerned about a situation like where a lot of
table sync failed. In which case, if we don’t drop table sync worker
statistics after completing its job, we end up having a lot of entries
in the view unless the subscription is dropped.
True, but the same could be said for apply workers where errors can be
accumulated over a period of time.
I have another question in this regard. Currently, the reset function
seems to be resetting only the first stat entry for a subscription.
But can't we have multiple stat entries for a subscription considering
the view's cumulative nature?
I might be missing your points but I think that with the current
patch, the view has multiple entries for a subscription. That is,
there is one apply worker stats and multiple table sync worker stats
per subscription.
Can't we have multiple entries for one apply worker?
And pg_stat_reset_subscription() function can reset
any stats by specifying subscription OID and relation OID.
Say, if the user has supplied just subscription OID then isn't it
better to reset all the error entries for that subscription?
--
With Regards,
Amit Kapila.
On Mon, Nov 1, 2021 at 7:25 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Oct 30, 2021 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Don't we want these stats to be dealt with in the same way as tables and
functions as all the stats entries (subscription entries) are specific
to a particular database? If so, I think we should write/read these
to/from db specific stats file in the same way as we do for tables or
functions. I think in the current patch, it will unnecessarily read
and probably write subscription stats even when those are not
required.
Good point! So probably we should have PgStat_StatDBEntry have the
hash table for subscription worker statistics, right?
Yes.
--
With Regards,
Amit Kapila.
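The per-database layout being agreed on here can be sketched roughly as follows. This is an illustrative standalone model, not the actual PostgreSQL structs: the real PgStat_StatDBEntry would hold a dynamic hash table keyed by (subid, subrelid), whereas this sketch uses a fixed array for brevity:

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t Oid;

/* Illustrative key: one stats entry per (subscription, tablesync rel). */
typedef struct SubWorkerKey
{
	Oid			subid;			/* subscription OID */
	Oid			subrelid;		/* 0 (InvalidOid) for the apply worker */
} SubWorkerKey;

typedef struct SubWorkerEntry
{
	SubWorkerKey key;
	uint64_t	error_count;
	char		last_error[256];
} SubWorkerEntry;

/*
 * Illustrative per-database stats entry.  Hanging the subscription
 * worker stats off the database entry means they are written to and
 * read from that database's stats file only, exactly like table and
 * function stats, instead of being loaded for every backend.
 */
#define MAX_ENTRIES 16
typedef struct DBEntry
{
	Oid			dbid;
	int			nworkers;
	SubWorkerEntry workers[MAX_ENTRIES];
} DBEntry;

/* Find the entry for (subid, subrelid); optionally create it. */
static SubWorkerEntry *
get_subworker_entry(DBEntry *db, Oid subid, Oid subrelid, int create)
{
	for (int i = 0; i < db->nworkers; i++)
		if (db->workers[i].key.subid == subid &&
			db->workers[i].key.subrelid == subrelid)
			return &db->workers[i];

	if (!create || db->nworkers >= MAX_ENTRIES)
		return NULL;

	SubWorkerEntry *e = &db->workers[db->nworkers++];
	memset(e, 0, sizeof(*e));
	e->key.subid = subid;
	e->key.subrelid = subrelid;
	return e;
}
```

The lookup-or-create pattern mirrors how pgstat handles table and function entries, which is the consistency argument being made above.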
On Tue, Nov 2, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 1, 2021 at 7:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Fair enough. So statistics can be removed either by vacuum or drop
subscription. Also, if we go by this logic then there is no harm in
retaining the stat entries for tablesync errors. Why have different
behavior for apply and tablesync workers?
My understanding is that the subscription worker statistics entry
corresponds to workers (but not physical workers since the physical
process is changed after restarting). So if the worker finishes its
jobs, it is no longer necessary to show errors since further problems
will not occur after that. Table sync worker’s job finishes when
completing table copy (unless table sync is performed again by REFRESH
PUBLICATION) whereas apply worker’s job finishes when the subscription
is dropped.
Actually, I am not very sure how users can use the old error
information after we allowed skipping the conflicting xid. Say, if
they want to add/remove some constraints on the table based on
previous errors then they might want to refer to errors of both the
apply worker and table sync worker.
I think that in general, statistics should be retained as long as a
corresponding object exists on the database, like other cumulative
statistic views. So I’m concerned that an entry of a cumulative stats
view is automatically removed by a non-stats-related function (i.g.,
ALTER SUBSCRIPTION SKIP). Which seems a new behavior for cumulative
stats views.
We can retain the stats entries for table sync worker but what I want
to avoid is that the view shows many old entries that will never be
updated. I've sometimes seen cases where the user mistakenly restored
table data on the subscriber before creating a subscription, failed
table sync on many tables due to unique violation, and truncated
tables on the subscriber. I think that unlike the stats entries for
apply worker, retaining the stats entries for table sync could be
harmful since it’s likely to be a large amount (even hundreds of
entries). Especially, it could lead to bloat the stats file since it
has an error message. So if we do that, I'd like to provide a function
for users to remove (not reset) stats entries manually. Even if we
removed stats entries after skipping the transaction in question, the
stats entries would be left if we resolve the conflict in another way.
I have another question in this regard. Currently, the reset function
seems to be resetting only the first stat entry for a subscription.
But can't we have multiple stat entries for a subscription considering
the view's cumulative nature?
I might be missing your points but I think that with the current
patch, the view has multiple entries for a subscription. That is,
there is one apply worker stats and multiple table sync worker stats
per subscription.Can't we have multiple entries for one apply worker?
Umm, I think we have one stats entry per one logical replication
worker (apply worker or table sync worker). Am I missing something?
And pg_stat_reset_subscription() function can reset
any stats by specifying subscription OID and relation OID.
Say, if the user has supplied just subscription OID then isn't it
better to reset all the error entries for that subscription?
Agreed. So pg_stat_reset_subscription_worker(oid) removes all errors
for the subscription whereas pg_stat_reset_subscription_worker(oid,
null) resets only the apply worker error for the subscription?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Friday, October 29, 2021 1:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached a new version patch. Since the syntax of skipping
transaction id is under the discussion I've attached only the error
reporting patch for now.
Thanks for your patch. Some comments on the 026_error_report.pl file.
1. For test_tab_streaming table, the test only checks initial table sync and
doesn't check anything related to the new view pg_stat_subscription_workers. Do
you want to add more test cases for it?
2. The subscriptions are created with two_phase option on, but I didn't see two
phase transactions. Should we add some test cases for two phase transactions?
3. Errors reported by table sync worker will be cleaned up if the table sync
worker finishes, should we add this case to the test? (After checking the table
sync worker's error in the view, delete data which caused the error, then check
the view again after table sync worker finished.)
Regards
Tang
On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Nov 2, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I have another question in this regard. Currently, the reset function
seems to be resetting only the first stat entry for a subscription.
But can't we have multiple stat entries for a subscription considering
the view's cumulative nature?
I might be missing your points but I think that with the current
patch, the view has multiple entries for a subscription. That is,
there is one apply worker stats and multiple table sync worker stats
per subscription.
Can't we have multiple entries for one apply worker?
Umm, I think we have one stats entry per one logical replication
worker (apply worker or table sync worker). Am I missing something?
No, you are right. I got confused.
And pg_stat_reset_subscription() function can reset
any stats by specifying subscription OID and relation OID.
Say, if the user has supplied just subscription OID then isn't it
better to reset all the error entries for that subscription?
Agreed. So pg_stat_reset_subscription_worker(oid) removes all errors
for the subscription whereas pg_stat_reset_subscription_worker(oid,
null) resets only the apply worker error for the subscription?
Yes.
--
With Regards,
Amit Kapila.
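The reset semantics just agreed on can be sketched in a few lines. This is a standalone illustration with made-up names, not the patch's actual pgstat_reset_subworker_stats(); it only models the dispatch on the relid argument:

```c
#include <stdint.h>
#include <stddef.h>

typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)

typedef struct WorkerStats
{
	Oid			subid;
	Oid			subrelid;		/* InvalidOid for the apply worker */
	uint64_t	error_count;
} WorkerStats;

/*
 * Illustrative reset semantics from the discussion:
 *   relid == NULL         -> one-argument SQL form: reset every entry
 *                            belonging to the subscription;
 *   *relid == InvalidOid  -> reset only the apply worker entry;
 *   otherwise             -> reset the matching tablesync entry.
 */
static void
reset_subworker_stats(WorkerStats *stats, int n, Oid subid, const Oid *relid)
{
	for (int i = 0; i < n; i++)
	{
		if (stats[i].subid != subid)
			continue;
		if (relid == NULL || stats[i].subrelid == *relid)
			stats[i].error_count = 0;
	}
}
```

Modeling the SQL NULL as a null pointer here keeps the three cases distinct; at the SQL level the same distinction comes from PG_ARGISNULL versus the one-argument overload.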
On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Nov 2, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 1, 2021 at 7:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Fair enough. So statistics can be removed either by vacuum or drop
subscription. Also, if we go by this logic then there is no harm in
retaining the stat entries for tablesync errors. Why have different
behavior for apply and tablesync workers?
My understanding is that the subscription worker statistics entry
corresponds to workers (but not physical workers since the physical
process is changed after restarting). So if the worker finishes its
jobs, it is no longer necessary to show errors since further problems
will not occur after that. Table sync worker’s job finishes when
completing table copy (unless table sync is performed again by REFRESH
PUBLICATION) whereas apply worker’s job finishes when the subscription
is dropped.

Actually, I am not very sure how users can use the old error
information after we allowed skipping the conflicting xid. Say, if
they want to add/remove some constraints on the table based on
previous errors then they might want to refer to errors of both the
apply worker and table sync worker.

I think that in general, statistics should be retained as long as a
corresponding object exists on the database, like other cumulative
statistic views. So I’m concerned that an entry of a cumulative stats
view is automatically removed by a non-stats-related function (e.g.,
ALTER SUBSCRIPTION SKIP), which seems to be new behavior for cumulative
stats views.

We can retain the stats entries for the table sync worker, but what I want
to avoid is that the view shows many old entries that will never be
updated. I've sometimes seen cases where the user mistakenly restored
table data on the subscriber before creating a subscription, failed
table sync on many tables due to unique violation, and truncated
tables on the subscriber. I think that unlike the stats entries for
the apply worker, retaining the stats entries for table sync could be
harmful since they are likely to be numerous (even hundreds of
entries). In particular, it could bloat the stats file since each
entry carries an error message. So if we do that, I'd like to provide
a function
for users to remove (not reset) stats entries manually.
If we follow the idea of keeping stats at db level (in
PgStat_StatDBEntry) as discussed above then I think we already have a
way to remove stat entries via pg_stat_reset which removes the stats
corresponding to tables, functions and after this patch corresponding
to subscriptions as well for the current database. Won't that be
sufficient? I see your point but I think it may be better if we keep
the same behavior for stats of apply and table sync workers.
Following the pattern of tables and functions, I thought of keeping the name of the
reset function similar to "pg_stat_reset_single_table_counters" but I
feel the currently used name "pg_stat_reset_subscription_worker" in
the patch is better. Do let me know what you think?
--
With Regards,
Amit Kapila.
On Fri, Oct 29, 2021 at 10:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Thank you for the comments!
Few comments:
==============
1. Is the patch cleaning tablesync error entries except via vacuum? If
not, can't we send a message to remove tablesync errors once tablesync
is successful (say when we reset skip_xid or when tablesync is
finished) or when we drop subscription? I think the same applies to
apply worker. I think we may want to track in some way whether an
error has occurred before sending the message, but relying completely
on a vacuum might be a recipe for bloat. I think in the case of a
drop subscription we can simply send the message as that is not a
frequent operation. I might be missing something here because in the
tests after drop subscription you are expecting the entries from the
view to get cleared.

Yes, I think we can have the tablesync worker send a message to drop stats
once tablesync is successful. But if we do that also when dropping a
subscription, I think we need to do that only after the transaction is
committed, since we can drop a subscription that doesn't have a
replication slot and roll back the transaction. Probably we can send
the message only when the subscription does have a replication slot.

Right. And probably for the apply worker after updating the skip xid.
I'm not sure it's better to drop apply worker stats after resetting
skip xid (i.g., after skipping the transaction). Since the view is a
cumulative view and has last_error_time, I thought we can have the
apply worker stats until the subscription gets dropped. Since the
error reporting message could get lost, the absence of an entry in the
view doesn’t mean the worker didn’t face an issue.

In other cases, we can remember the subscriptions being dropped and
send the message to drop the statistics of them after committing the
transaction, but I’m not sure it’s worth having it.

Yeah, let's not go to that extent. I think in most cases subscriptions
will have corresponding slots.

Agreed.
FWIW, we completely
rely on pg_stat_vacuum_stats() for cleaning up the dead tables and
functions. And we don't expect there to be many subscriptions on the
database.

True, but we do send it for the database, so let's do it for the cases
you explained in the first paragraph.

Agreed.
I've attached a new version of the patch. Since the syntax for skipping
a transaction ID is under discussion, I've attached only the error
reporting patch for now.
Thanks for the updated patch, few comments:
1) This check and return can be moved above CreateTemplateTupleDesc so
that the tuple descriptor need not be created if there is no worker
statistics
+ BlessTupleDesc(tupdesc);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_subworker(subid, subrelid);
+
+ /* Return NULL if there is no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
2) "NULL for the main apply worker" is mentioned as "null for the main
apply worker" in case of pg_stat_subscription view, we can mention it
similarly.
+ <para>
+ OID of the relation that the worker is synchronizing; NULL for the
+ main apply worker
+ </para></entry>
3) Variable assignment can be done during declaration and this
assignment can be removed
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
4) I noticed that the worker error is still present when queried from
pg_stat_subscription_workers even after conflict is resolved in the
subscriber and the worker proceeds with applying the other
transactions, should this be documented somewhere?
5) This needs to be aligned: the columns in the select use TABs; we
should align them using spaces.
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.last_error_time,
+ w.stats_reset
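Regarding point 4 above, here is a hedged sketch of how a user might spot error entries whose conflict may already be resolved, by looking at last_error_time in the view this patch adds (the column set is taken from the patch; the one-hour threshold is an arbitrary example):

```sql
-- Sketch: list error entries whose last error is old. With the patch's
-- view, a stale last_error_time combined with ongoing apply progress
-- may mean the conflict has since been resolved on the subscriber.
SELECT subname, relid, command, xid, error_count, last_error_time
FROM pg_stat_subscription_workers
WHERE last_error_time < now() - interval '1 hour';
```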
Regards,
Vignesh
On Wed, Nov 3, 2021 at 12:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Nov 2, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 1, 2021 at 7:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Fair enough. So statistics can be removed either by vacuum or drop
subscription. Also, if we go by this logic then there is no harm in
retaining the stat entries for tablesync errors. Why have different
behavior for apply and tablesync workers?

My understanding is that the subscription worker statistics entry
corresponds to workers (but not physical workers since the physical
process is changed after restarting). So if the worker finishes its
jobs, it is no longer necessary to show errors since further problems
will not occur after that. Table sync worker’s job finishes when
completing table copy (unless table sync is performed again by REFRESH
PUBLICATION) whereas apply worker’s job finishes when the subscription
is dropped.

Actually, I am not very sure how users can use the old error
information after we allowed skipping the conflicting xid. Say, if
they want to add/remove some constraints on the table based on
previous errors then they might want to refer to errors of both the
apply worker and table sync worker.

I think that in general, statistics should be retained as long as a
corresponding object exists on the database, like other cumulative
statistic views. So I’m concerned that an entry of a cumulative stats
view is automatically removed by a non-stats-related function (e.g.,
ALTER SUBSCRIPTION SKIP), which seems to be new behavior for cumulative
stats views.

We can retain the stats entries for the table sync worker, but what I want
to avoid is that the view shows many old entries that will never be
updated. I've sometimes seen cases where the user mistakenly restored
table data on the subscriber before creating a subscription, failed
table sync on many tables due to unique violation, and truncated
tables on the subscriber. I think that unlike the stats entries for
the apply worker, retaining the stats entries for table sync could be
harmful since they are likely to be numerous (even hundreds of
entries). In particular, it could bloat the stats file since each
entry carries an error message. So if we do that, I'd like to provide
a function for users to remove (not reset) stats entries manually.

If we follow the idea of keeping stats at db level (in
PgStat_StatDBEntry) as discussed above then I think we already have a
way to remove stat entries via pg_stat_reset which removes the stats
corresponding to tables, functions and after this patch corresponding
to subscriptions as well for the current database. Won't that be
sufficient? I see your point but I think it may be better if we keep
the same behavior for stats of apply and table sync workers.
Make sense.
Following the pattern of tables and functions, I thought of keeping the name of the
reset function similar to "pg_stat_reset_single_table_counters" but I
feel the currently used name "pg_stat_reset_subscription_worker" in
the patch is better. Do let me know what you think?
Yeah, I also tend to prefer pg_stat_reset_subscription_worker name
since "single" isn't clear in the context of subscription worker. And
the behavior of the reset function for subscription workers is also
different from pg_stat_reset_single_xxx_counters.
I've attached an updated patch. In this version of the patch, subscription
worker statistics are collected per-database and handled in a similar
way to tables and functions. I think perhaps we still need to discuss
details of how the statistics should be handled but I'd like to share
the patch for discussion.
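For context, tying this view back to the thread's original goal of skipping a failing transaction, a hedged sketch of the intended workflow (the view is from this patch; the ALTER SUBSCRIPTION ... SKIP syntax is still under discussion and is shown only as a comment, with a made-up subscription name and XID):

```sql
-- Step 1 (sketch): find the remote XID of the failing transaction from
-- the apply worker's error entry (subrelid IS NULL means apply worker).
SELECT subname, xid, error_count, error_message
FROM pg_stat_subscription_workers
WHERE subrelid IS NULL;

-- Step 2 (proposed, syntax not final): skip that transaction, e.g.
-- ALTER SUBSCRIPTION test_sub SET SKIP TRANSACTION 590;
```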
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v20-0001-Add-a-subscription-worker-statistics-view-pg_sta.patchapplication/octet-stream; name=v20-0001-Add-a-subscription-worker-statistics-view-pg_sta.patchDownload
From 9f0e5c242a0a068b575dcc53307f9bd2e1b9cde1 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v20 1/3] Add a subscription worker statistics view
"pg_stat_subscription_workers".
This commit adds a new system view pg_stat_subscription_workers,
which shows information about any errors that occur during the
application of logical replication changes as well as during the
initial table synchronization.
The subscription error entries are removed by autovacuum workers after
table synchronization completes in table sync worker cases and after
dropping the subscription in apply worker cases.
It also adds an SQL function pg_stat_reset_subscription_worker() to
reset single subscription errors.
---
doc/src/sgml/monitoring.sgml | 163 ++++++++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 24 ++
src/backend/commands/subscriptioncmds.c | 15 +-
src/backend/postmaster/pgstat.c | 420 +++++++++++++++++++-
src/backend/replication/logical/worker.c | 54 ++-
src/backend/utils/adt/pgstatfuncs.c | 120 ++++++
src/include/catalog/pg_proc.dat | 18 +
src/include/pgstat.h | 121 +++++-
src/test/regress/expected/rules.out | 19 +
src/test/subscription/t/026_error_report.pl | 156 ++++++++
src/tools/pgindent/typedefs.list | 6 +
12 files changed, 1108 insertions(+), 10 deletions(-)
create mode 100644 src/test/subscription/t/026_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3173ec2566..7a4a1a2a76 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>At least one row per subscription, showing about errors that
+ occurred on subscription.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3034,6 +3043,136 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription error reported by workers applying logical
+ replication changes and workers handling the initial data copy of the
+ subscribed tables.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This field
+ is always NULL if the error was reported during the initial data
+ copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is always NULL if the error was reported
+ during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which these statistics were last reset
+ </para></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5156,6 +5295,30 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <optional> <parameter>relid</parameter> <type>oid</type> </optional> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets statistics of a single subscription worker statistics. If
+ the argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets statistics of the <literal>tablesync</literal> worker for
+ the relation with <parameter>relid</parameter>. Otherwise, resets the
+ subscription worker statistics of the <literal>apply</literal> worker
+ running on the subscription with <parameter>subid</parameter>. If the
+ argument <parameter>relid</parameter> is omitted, resets all subscription
+ worker statistics associated with the <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 54c93b16c4..921fce5546 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,8 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb560955cd..fece48aabd 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,3 +1261,27 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.last_error_time
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
+ JOIN pg_subscription s ON (w.subid = s.oid);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..18962b91e1 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -32,6 +32,7 @@
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/origin.h"
#include "replication/slot.h"
@@ -1204,7 +1205,8 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Since dropping a replication slot is not transactional, the replication
* slot stays dropped even if the transaction rolls back. So we cannot
* run DROP SUBSCRIPTION inside a transaction block if dropping the
- * replication slot.
+ * replication slot. Also, in this case, we report a message for dropping
+ * the subscription to the stats collector.
*
* XXX The command name should really be something like "DROP SUBSCRIPTION
* of a subscription that is associated with a replication slot", but we
@@ -1377,6 +1379,17 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
PG_END_TRY();
+ /*
+ * Send a message for dropping this subscription to the stats collector. We
+ * can safely report dropping the subscription statistics here if the
+ * subscription is associated with a replication slot since we cannot run
+ * DROP SUBSCRIPTION inside a transaction block. Subscription statistics will
+ * be removed later by (auto)vacuum either if it's not associated with a
+ * replication slot or if the message for dropping the subscription gets lost.
+ */
+ if (slotname)
+ pgstat_report_subscription_drop(subid);
+
table_close(rel, NoLock);
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b7d0fbaefd..d944361c80 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +108,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
/* ----------
@@ -320,10 +323,14 @@ NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_no
static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
Oid tableoid, bool create);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry,
+ Oid subid, Oid subrelid,
+ bool create);
static void pgstat_write_statsfiles(bool permanent, bool allDbs);
static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
-static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, bool permanent);
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+ HTAB *subworkerhash, bool permanent);
static void backend_read_statsfile(void);
static bool pgstat_write_statsfile_needed(void);
@@ -332,9 +339,11 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
+static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
static bool pgstat_should_report_connstat(void);
static void pgstat_report_disconnect(Oid dboid);
@@ -356,6 +365,7 @@ static void pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, in
static void pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len);
static void pgstat_recv_resetslrucounter(PgStat_MsgResetslrucounter *msg, int len);
static void pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg, int len);
+static void pgstat_recv_resetsubworkercounter(PgStat_MsgResetsubworkercounter *msg, int len);
static void pgstat_recv_autovac(PgStat_MsgAutovacStart *msg, int len);
static void pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len);
static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len);
@@ -373,6 +383,8 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1302,6 +1314,52 @@ pgstat_vacuum_stat(void)
hash_destroy(htab);
}
+
+ /* Repeat for subscription workers */
+ if (dbentry->subworkers != NULL &&
+ hash_get_num_entries(dbentry->subworkers) > 0)
+ {
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ pgstat_setheader(&spmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL)
+ != NULL)
+ continue;
+
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_send_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_send_subscription_purge(&spmsg);
+
+ hash_destroy(htab);
+ }
}
@@ -1354,6 +1412,23 @@ pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid)
return htab;
}
+/* --------
+ * pgstat_send_subscription_purge() -
+ *
+ * Send a subscription purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
/* ----------
* pgstat_drop_database() -
@@ -1544,6 +1619,26 @@ pgstat_reset_replslot_counter(const char *name)
pgstat_send(&msg, sizeof(msg));
}
+/* ----------
+ * pgstat_reset_subworker_stats() -
+ *
+ * Tell the collector to reset the subscription worker statistics.
+ * ----------
+ */
+void
+pgstat_reset_subworker_stats(Oid subid, Oid subrelid, bool allstats)
+{
+ PgStat_MsgResetsubworkercounter msg;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETSUBWORKERCOUNTER);
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_allstats = allstats;
+
+ pgstat_send(&msg, sizeof(PgStat_MsgResetsubworkercounter));
+}
+
/* ----------
* pgstat_report_autovac() -
*
@@ -1869,6 +1964,53 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+ len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_report_subscription_drop() -
+ *
+ * Tell the collector about dropping the subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_drop(Oid subid)
+{
+ PgStat_MsgSubscriptionPurge msg;
+
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subids[0] = subid;
+ msg.m_nentries = 1;
+ pgstat_send_subscription_purge(&msg);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2874,6 +3016,33 @@ pgstat_fetch_stat_funcentry(Oid func_id)
return funcentry;
}
+/*
+ * ---------
+ * pgstat_fetch_stat_subworker_entry() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct or NULL.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_stat_subworker_entry(Oid subid, Oid subrelid)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *wentry = NULL;
+
+ /* Load the stats file if needed */
+ backend_read_statsfile();
+
+ /* Look up database, then find the requested subscription worker stats */
+ dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+ if (dbentry != NULL && dbentry->subworkers != NULL)
+ {
+ wentry = pgstat_get_subworker_entry(dbentry, subid, subrelid,
+ false);
+ }
+
+ return wentry;
+}
/*
* ---------
@@ -3498,6 +3667,11 @@ PgstatCollectorMain(int argc, char *argv[])
len);
break;
+ case PGSTAT_MTYPE_RESETSUBWORKERCOUNTER:
+ pgstat_recv_resetsubworkercounter(&msg.msg_resetsubworkercounter,
+ len);
+ break;
+
case PGSTAT_MTYPE_AUTOVAC_START:
pgstat_recv_autovac(&msg.msg_autovacuum_start, len);
break;
@@ -3568,6 +3742,14 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
default:
break;
}
@@ -3666,6 +3848,13 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
PGSTAT_FUNCTION_HASH_SIZE,
&hash_ctl,
HASH_ELEM | HASH_BLOBS);
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
}
/*
@@ -3690,7 +3879,7 @@ pgstat_get_db_entry(Oid databaseid, bool create)
/*
* If not found, initialize the new one. This creates empty hash tables
- * for tables and functions, too.
+ * for tables, functions, and subscription worker, too.
*/
if (!found)
reset_dbentry_counters(result);
@@ -3947,8 +4136,10 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
{
HASH_SEQ_STATUS tstat;
HASH_SEQ_STATUS fstat;
+ HASH_SEQ_STATUS sstat;
PgStat_StatTabEntry *tabentry;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpout;
int32 format_id;
Oid dbid = dbentry->databaseid;
@@ -4003,6 +4194,17 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
(void) rc; /* we'll check for error with ferror */
}
+ /*
+ * Walk through the database's subscription worker stats table.
+ */
+ hash_seq_init(&sstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(subwentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4241,6 +4443,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
memcpy(dbentry, &dbbuf, sizeof(PgStat_StatDBEntry));
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* In the collector, disregard the timestamp we read from the
@@ -4278,6 +4481,14 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
&hash_ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
/*
* If requested, read the data from the database-specific
* file. Otherwise we just leave the hashtables empty.
@@ -4286,6 +4497,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
pgstat_read_db_statsfile(dbentry->databaseid,
dbentry->tables,
dbentry->functions,
+ dbentry->subworkers,
permanent);
break;
@@ -4370,12 +4582,14 @@ done:
*/
static void
pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
- bool permanent)
+ HTAB *subworkerhash, bool permanent)
{
PgStat_StatTabEntry *tabentry;
PgStat_StatTabEntry tabbuf;
PgStat_StatFuncEntry funcbuf;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry subwbuf;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpin;
int32 format_id;
bool found;
@@ -4489,6 +4703,41 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
memcpy(funcentry, &funcbuf, sizeof(funcbuf));
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&subwbuf, 1, sizeof(PgStat_StatSubWorkerEntry),
+ fpin) != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /*
+ * Skip if subscription worker data not wanted.
+ */
+ if (subworkerhash == NULL)
+ break;
+
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(subworkerhash,
+ (void *) &subwbuf.key,
+ HASH_ENTER, &found);
+
+ if (found)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ memcpy(subwentry, &subwbuf, sizeof(subwbuf));
+ break;
+
/*
* 'E' The EOF marker of a complete stats file.
*/
@@ -5162,6 +5411,8 @@ pgstat_recv_dropdb(PgStat_MsgDropdb *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
if (hash_search(pgStatDBHash,
(void *) &dbid,
@@ -5199,13 +5450,16 @@ pgstat_recv_resetcounter(PgStat_MsgResetcounter *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* Reset database-level stats, too. This creates empty hash tables for
- * tables and functions.
+ * tables, functions, and subscription workers.
*/
reset_dbentry_counters(dbentry);
}
@@ -5344,6 +5598,48 @@ pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
}
}
+/* ----------
+ * pgstat_recv_resetsubworkercounter() -
+ *
+ * Process a RESETSUBWORKERCOUNTER message.
+ * ----------
+ */
+static void
+pgstat_recv_resetsubworkercounter(PgStat_MsgResetsubworkercounter *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerKey key;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+
+ if (!dbentry || !dbentry->subworkers)
+ return;
+
+ /* Set the reset timestamp for the whole database */
+ dbentry->stat_reset_timestamp = GetCurrentTimestamp();
+
+ if (msg->m_allstats)
+ {
+ HASH_SEQ_STATUS hstat;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ /* Remove all statistics associated with m_subid */
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ if (subwentry->key.subid == msg->m_subid)
+ (void) hash_search(dbentry->subworkers, (void *) &(subwentry->key),
+ HASH_REMOVE, NULL);
+ }
+ }
+ else
+ {
+ /* Remove single statistics */
+ key.subid = msg->m_subid;
+ key.subrelid = msg->m_subrelid;
+ (void) hash_search(dbentry->subworkers, (void *) &key, HASH_REMOVE, NULL);
+ }
+}
/* ----------
* pgstat_recv_autovac() -
@@ -5816,6 +6112,83 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS hstat;
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+
+ /* No need to purge if we don't even know the database */
+ if (!dbentry || !dbentry->subworkers)
+ return;
+
+ /* Remove all subscription worker statistics of the given subscriptions */
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (subwentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(dbentry->subworkers, (void *) &(subwentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+ /* Get the subscription worker stats */
+ subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+ msg->m_subrelid, true);
+ Assert(subwentry);
+
+ /*
+ * Update only the counter and timestamp if we received the same error
+ * again
+ */
+ if (subwentry->relid == msg->m_relid &&
+ subwentry->command == msg->m_command &&
+ subwentry->xid == msg->m_xid &&
+ strcmp(subwentry->error_message, msg->m_message) == 0)
+ {
+ subwentry->error_count++;
+ subwentry->error_time = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ subwentry->relid = msg->m_relid;
+ subwentry->command = msg->m_command;
+ subwentry->xid = msg->m_xid;
+ subwentry->error_count = 1;
+ subwentry->error_time = msg->m_timestamp;
+ strlcpy(subwentry->error_message, msg->m_message, PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
@@ -5934,6 +6307,45 @@ pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotent, TimestampTz ts)
slotent->stat_reset_timestamp = ts;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return the subscription worker entry for the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, return the apply worker's entry;
+ * otherwise return the entry of the table sync worker associated with
+ * subrelid. If no entry exists, create and initialize one when the create
+ * parameter is true; otherwise return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+ (void *) &key,
+ action, &found);
+
+ /* If not found, initialize the new one */
+ if (create && !found)
+ {
+ subwentry->relid = InvalidOid;
+ subwentry->command = 0;
+ subwentry->xid = InvalidTransactionId;
+ subwentry->error_count = 0;
+ subwentry->error_time = 0;
+ subwentry->error_message[0] = '\0';
+ }
+
+ return subwentry;
+}
+
/*
* pgstat_slru_index
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 0bd5d0ee5e..e2a929b166 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3329,6 +3329,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3429,8 +3430,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding message
+ * type for table synchronization.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3548,7 +3571,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..aa17b82ea6 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
#include "storage/procarray.h"
@@ -2239,6 +2240,29 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset all subscription worker stats associated with the given subscription */
+Datum
+pg_stat_reset_subscription_worker_sub(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+
+ pgstat_reset_subworker_stats(subid, InvalidOid, true);
+
+ PG_RETURN_VOID();
+}
+
+/* Reset a subscription worker stats */
+Datum
+pg_stat_reset_subscription_worker_subrel(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid = PG_ARGISNULL(1) ? InvalidOid : PG_GETARG_OID(1);
+
+ pgstat_reset_subworker_stats(subid, relid, false);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2403,99 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 8
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_stat_subworker_entry(subid, subrelid);
+
+ /* Return NULL if there is no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "error_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* relid */
+ if (OidIsValid(wentry->relid))
+ values[i++] = ObjectIdGetDatum(wentry->relid);
+ else
+ nulls[i++] = true;
+
+ /* command */
+ if (wentry->command != 0)
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(wentry->command));
+ else
+ nulls[i++] = true;
+
+ /* xid */
+ if (TransactionIdIsValid(wentry->xid))
+ values[i++] = TransactionIdGetDatum(wentry->xid);
+ else
+ nulls[i++] = true;
+
+ /* error_count */
+ values[i++] = Int64GetDatum(wentry->error_count);
+
+ /* error_message */
+ values[i++] = CStringGetTextDatum(wentry->error_message);
+
+ /* last_error_time */
+ if (wentry->error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->error_time);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..528c39d391 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription worker',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,relid,command,xid,error_count,error_message,last_error_time}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,16 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker_subrel' },
+{ oid => '8525',
+ descr => 'statistics: reset all collected statistics for a single subscription',
+ proname => 'pg_stat_reset_subscription_worker',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid',
+ prosrc => 'pg_stat_reset_subscription_worker_sub' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..3749bacced 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -66,6 +67,7 @@ typedef enum StatMsgType
PGSTAT_MTYPE_RESETSINGLECOUNTER,
PGSTAT_MTYPE_RESETSLRUCOUNTER,
PGSTAT_MTYPE_RESETREPLSLOTCOUNTER,
+ PGSTAT_MTYPE_RESETSUBWORKERCOUNTER,
PGSTAT_MTYPE_AUTOVAC_START,
PGSTAT_MTYPE_VACUUM,
PGSTAT_MTYPE_ANALYZE,
@@ -83,6 +85,8 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
} StatMsgType;
/* ----------
@@ -389,6 +393,28 @@ typedef struct PgStat_MsgResetreplslotcounter
bool clearall;
} PgStat_MsgResetreplslotcounter;
+/* ----------
+ * PgStat_MsgResetsubworkercounter Sent by the backend to reset the
+ * subscription worker statistics.
+ * ----------
+ */
+typedef struct PgStat_MsgResetsubworkercounter
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * Same as PgStat_MsgSubWorkerError, m_subid and m_subrelid are used to
+ * determine the subscription and the reporter of the error: the apply
+ * worker or the table sync worker.
+ */
+ Oid m_databaseid;
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* Reset all subscription worker stats associated with m_subid */
+ bool m_allstats;
+} PgStat_MsgResetsubworkercounter;
+
/* ----------
* PgStat_MsgAutovacStart Sent by the autovacuum daemon to signal
* that a database is going to be processed
@@ -536,6 +562,53 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about the dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_databaseid;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
+ * report the error occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of the error. m_subrelid is InvalidOid if the error is reported
+ * by an apply worker; otherwise it is reported by a table sync worker.
+ */
+ Oid m_databaseid;
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * OID of the table that the reporter was actually processing. m_relid
+ * can be InvalidOid if an error occurred while the worker was applying
+ * a non-data-modification message such as RELATION.
+ */
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -697,6 +770,7 @@ typedef union PgStat_Msg
PgStat_MsgResetsinglecounter msg_resetsinglecounter;
PgStat_MsgResetslrucounter msg_resetslrucounter;
PgStat_MsgResetreplslotcounter msg_resetreplslotcounter;
+ PgStat_MsgResetsubworkercounter msg_resetsubworkercounter;
PgStat_MsgAutovacStart msg_autovacuum_start;
PgStat_MsgVacuum msg_vacuum;
PgStat_MsgAnalyze msg_analyze;
@@ -714,6 +788,8 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
} PgStat_Msg;
@@ -768,11 +844,16 @@ typedef struct PgStat_StatDBEntry
TimestampTz stats_timestamp; /* time of db stats file update */
/*
- * tables and functions must be last in the struct, because we don't write
- * the pointers out to the stats file.
+ * tables, functions, and subscription workers must be last in the struct,
+ * because we don't write the pointers out to the stats file.
+ *
+ * subworkers is the hash table of PgStat_StatSubWorkerEntry, which stores
+ * statistics of logical replication workers: the apply worker and table
+ * sync workers.
*/
HTAB *tables;
HTAB *functions;
+ HTAB *subworkers;
} PgStat_StatDBEntry;
@@ -929,6 +1010,35 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of logical replication or the initial table
+ * synchronization.
+ */
+ Oid dbid;
+ Oid relid;
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter error_count;
+ TimestampTz error_time;
+ char error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1022,6 +1132,7 @@ extern void pgstat_reset_shared_counters(const char *);
extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid, bool allstats);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
@@ -1038,6 +1149,10 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
@@ -1129,6 +1244,8 @@ extern void pgstat_send_wal(bool force);
extern PgStat_StatDBEntry *pgstat_fetch_stat_dbentry(Oid dbid);
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_stat_subworker_entry(Oid subid,
+ Oid subrelid);
extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..d7d17b7892 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,25 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.last_error_time
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, relid, command, xid, error_count, error_message, last_error_time)
+ JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/026_error_report.pl b/src/test/subscription/t/026_error_report.pl
new file mode 100644
index 0000000000..3d23bb55d4
--- /dev/null
+++ b/src/test/subscription/t/026_error_report.pl
@@ -0,0 +1,156 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 5;
+
+# Test that the error reported in the pg_stat_subscription_workers view is
+# as expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass];
+ $check_sql .= " AND xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, command, relid::regclass, error_count > 0
+FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter an
+# infinite error loop due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8ed83..e4d78a9370 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1942,7 +1942,11 @@ PgStat_MsgResetreplslotcounter
PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
+PgStat_MsgResetsubworkercounter
PgStat_MsgSLRU
+PgStat_MsgSubWorkerError
+PgStat_MsgSubscriptionPurge
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1954,6 +1958,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Fri, Nov 5, 2021 at 12:57 AM vignesh C <vignesh21@gmail.com> wrote:
On Fri, Oct 29, 2021 at 10:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached updated patches.
Thank you for the comments!
Few comments:
==============
1. Is the patch cleaning tablesync error entries except via vacuum? If
not, can't we send a message to remove tablesync errors once tablesync
is successful (say when we reset skip_xid or when tablesync is
finished) or when we drop subscription? I think the same applies to
apply worker. I think we may want to track it in some way whether an
error has occurred before sending the message but relying completely
on a vacuum might be the recipe of bloat. I think in the case of a
drop subscription we can simply send the message as that is not a
frequent operation. I might be missing something here because in the
tests after drop subscription you are expecting the entries from the
view to get cleared.

Yes, I think we can have tablesync worker send a message to drop stats
once tablesync is successful. But if we do that also when dropping a
subscription, I think we need to do that only if the transaction is
committed, since we can drop a subscription that doesn't have a
replication slot and roll back the transaction. Probably we can send
the message only when the subscription does have a replication slot.

Right. And probably for apply worker after updating skip xid.
I'm not sure it's better to drop apply worker stats after resetting
skip xid (i.e., after skipping the transaction). Since the view is a
cumulative view and has last_error_time, I thought we can have the
apply worker stats until the subscription gets dropped. Since the
error reporting message could get lost, no entry in the view doesn’t
mean the worker doesn’t face an issue.

In other cases, we can remember the subscriptions being dropped and
send the message to drop the statistics of them after committing the
transaction, but I’m not sure it’s worth having it.

Yeah, let's not go to that extent. I think in most cases subscriptions
will have corresponding slots.

Agreed.
FWIW, we completely
rely on pg_stat_vacuum_stats() for cleaning up the dead tables and
functions. And we don't expect there are many subscriptions on the
database.

True, but we do send it for the database, so let's do it for the cases
you explained in the first paragraph.

Agreed.
I've attached a new version patch. Since the syntax of skipping
transaction id is under the discussion I've attached only the error
reporting patch for now.

Thanks for the updated patch, few comments:

1) This check and return can be moved above CreateTemplateTupleDesc so
that the tuple descriptor need not be created if there is no worker
statistics:

+ BlessTupleDesc(tupdesc);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_subworker(subid, subrelid);
+
+ /* Return NULL if there is no worker statistics */
+ if (wentry == NULL)
+     PG_RETURN_NULL();
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));

2) "NULL for the main apply worker" is mentioned as "null for the main
apply worker" in the case of the pg_stat_subscription view; we can
mention it similarly.

+ <para>
+ OID of the relation that the worker is synchronizing; NULL for the
+ main apply worker
+ </para></entry>

3) Variable assignment can be done during declaration, and then this
assignment can be removed:

+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);

4) I noticed that the worker error is still present when queried from
pg_stat_subscription_workers even after the conflict is resolved on the
subscriber and the worker proceeds with applying the other
transactions; should this be documented somewhere?

5) This needs to be aligned; the columns in the SELECT have used TABs,
we should align them using spaces.

+CREATE VIEW pg_stat_subscription_workers AS
+    SELECT
+        w.subid,
+        s.subname,
+        w.subrelid,
+        w.relid,
+        w.command,
+        w.xid,
+        w.error_count,
+        w.error_message,
+        w.last_error_time,
+        w.stats_reset
Thank you for the comments! These comments are incorporated into the
latest (v20) patch I just submitted[1].
Regards,
[1]: /messages/by-id/CAD21AoAT42mhcqeB1jPfRL1+EUHbZk8MMY_fBgsyZvJeKNpG+w@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Nov 8, 2021 at 1:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. In this version patch, subscription
worker statistics are collected per-database and handled in a similar
way to tables and functions. I think perhaps we still need to discuss
details of how the statistics should be handled but I'd like to share
the patch for discussion.
Thanks for the updated patch.
Some initial comments on the v20 patch:
doc/src/sgml/monitoring.sgml
(1) wording
The word "information" seems to be missing after "showing" (otherwise
it reads "showing about errors", which isn't correct grammar).
I suggest the following change:
BEFORE:
+ <entry>At least one row per subscription, showing about errors that
+ occurred on subscription.
AFTER:
+ <entry>At least one row per subscription, showing information about
+ errors that occurred on subscription.
(2) pg_stat_reset_subscription_worker(subid Oid, relid Oid) function
documentation
The description doesn't read well. I'd suggest the following change:
BEFORE:
* Resets statistics of a single subscription worker statistics.
AFTER:
* Resets the statistics of a single subscription worker.
I think that the documentation for this function should make it clear
that a non-NULL "subid" parameter is required for both reset cases
(tablesync and apply).
Perhaps this could be done by simply changing the first sentence to say:
"Resets the statistics of a single subscription worker, for a worker
running on the subscription with <parameter>subid</parameter>."
(and then can remove " running on the subscription with
<parameter>subid</parameter>" from the last sentence)
I think that the documentation for this function should say that it
should be used in conjunction with the "pg_stat_subscription_workers"
view in order to obtain the required subid/relid values for resetting.
(and should provide a link to the documentation for that view)
Also, I think that the function documentation should make it clear how
to distinguish the tablesync vs apply worker statistics case.
e.g. the tablesync error case is indicated by a null "command" in the
information returned from the "pg_stat_subscription_workers" view
(otherwise it seems a user could only know this by looking at the server log).
Finally, there are currently no tests for this new function.
(3) pg_stat_subscription_workers
In the documentation for this, some users may not realise that "the
initial data copy" refers to "tablesync", so maybe say "the initial
data copy (tablesync)", or similar.
(4) stats_reset
"stats_reset" is currently documented as the last column of the
"pg_stat_subscription_workers" view - but it's actually no longer
included in the view.
(5) src/tools/pgindent/typedefs.list
The following current entries are bogus:
PgStat_MsgSubWorkerErrorPurge
PgStat_MsgSubWorkerPurge
The following entry is missing:
PgStat_MsgSubscriptionPurge
Regards,
Greg Nancarrow
Fujitsu Australia
On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. In this version patch, subscription
worker statistics are collected per-database and handled in a similar
way to tables and functions. I think perhaps we still need to discuss
details of how the statistics should be handled but I'd like to share
the patch for discussion.
While reviewing the v20, I have some initial comments,
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>At least one row per subscription, showing about errors that
+ occurred on subscription.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
1.
I don't like the fact that this view is very specific for showing the
errors but the name of the view is very generic. So are we keeping
this name to expand the scope of the view in the future? If this is
meant only for showing the errors then the name should be more
specific.
2.
Why comment says "At least one row per subscription"? this looks
confusing, I mean if there is no error then there will not be even one
row right?
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription error reported by workers applying logical
+ replication changes and workers handling the initial data copy of the
+ subscribed tables.
+ </para>
3.
So there will only be one row per subscription? I did not read the
code, but suppose there was an error due to some constraint now if
that constraint is removed and there is a new error then the old error
will be removed immediately or it will be removed by auto vacuum? If
it is not removed immediately then there could be multiple errors per
subscription in the view so the comment is not correct.
4.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp
with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
+ </para></entry>
+ </row>
Will it be useful to know when the first time error occurred?
5.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_reset</structfield> <type>timestamp with
time zone</type>
+ </para>
+ <para>
The actual view does not contain this column.
6.
+ <para>
+ Resets statistics of a single subscription worker statistics.
/Resets statistics of a single subscription worker statistics/Resets
statistics of a single subscription worker
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Nov 3, 2021 at 12:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
If we follow the idea of keeping stats at db level (in
PgStat_StatDBEntry) as discussed above then I think we already have a
way to remove stat entries via pg_stat_reset which removes the stats
corresponding to tables, functions and after this patch corresponding
to subscriptions as well for the current database. Won't that be
sufficient? I see your point but I think it may be better if we keep
the same behavior for stats of apply and table sync workers.

Make sense.
We can document this point.
Following the tables, functions, I thought of keeping the name of the
reset function similar to "pg_stat_reset_single_table_counters" but I
feel the currently used name "pg_stat_reset_subscription_worker" in
the patch is better. Do let me know what you think?

Yeah, I also tend to prefer pg_stat_reset_subscription_worker name
since "single" isn't clear in the context of subscription worker. And
the behavior of the reset function for subscription workers is also
different from pg_stat_reset_single_xxx_counters.

I've attached an updated patch. In this version patch, subscription
worker statistics are collected per-database and handled in a similar
way to tables and functions. I think perhaps we still need to discuss
details of how the statistics should be handled but I'd like to share
the patch for discussion.
Do you have something specific in mind to discuss the details of how
stats should be handled?
Few comments/questions:
====================
1.
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry
*slotstats, TimestampTz ts);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
Spurious line addition.
2. Why now there is no code to deal with dead table sync entries as
compared to previous version of patch?
3. Why do we need two different functions
pg_stat_reset_subscription_worker_sub and
pg_stat_reset_subscription_worker_subrel to handle reset? Isn't it
sufficient to reset all entries for a subscription if relid is
InvalidOid?
4. It seems now stats_reset entry is not present in
pg_stat_subscription_workers? How will users find that information if
required?
--
With Regards,
Amit Kapila.
On Tue, Nov 9, 2021 at 11:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. In this version patch, subscription
worker statistics are collected per-database and handled in a similar
way to tables and functions. I think perhaps we still need to discuss
details of how the statistics should be handled but I'd like to share
the patch for discussion.

While reviewing the v20, I have some initial comments,

+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>At least one row per subscription, showing about errors that
+ occurred on subscription.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>

1.
I don't like the fact that this view is very specific for showing the
errors but the name of the view is very generic. So are we keeping
this name to expand the scope of the view in the future?
Yes, we are planning to display some other xact specific stats as well
corresponding to subscription workers. See [1][2].
[1]: /messages/by-id/OSBPR01MB48887CA8F40C8D984A6DC00CED199@OSBPR01MB4888.jpnprd01.prod.outlook.com
[2]: /messages/by-id/CAA4eK1+1n3upCMB-Y_k9b1wPNCtNE7MEHan9kA1s6GNsZGB0Og@mail.gmail.com
--
With Regards,
Amit Kapila.
On Tue, Nov 9, 2021 at 3:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Nov 3, 2021 at 12:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
If we follow the idea of keeping stats at db level (in
PgStat_StatDBEntry) as discussed above then I think we already have a
way to remove stat entries via pg_stat_reset which removes the stats
corresponding to tables, functions and after this patch corresponding
to subscriptions as well for the current database. Won't that be
sufficient? I see your point but I think it may be better if we keep
the same behavior for stats of apply and table sync workers.

Make sense.
We can document this point.
Okay.
Following the tables, functions, I thought of keeping the name of the
reset function similar to "pg_stat_reset_single_table_counters" but I
feel the currently used name "pg_stat_reset_subscription_worker" in
the patch is better. Do let me know what you think?

Yeah, I also tend to prefer pg_stat_reset_subscription_worker name
since "single" isn't clear in the context of subscription worker. And
the behavior of the reset function for subscription workers is also
different from pg_stat_reset_single_xxx_counters.

I've attached an updated patch. In this version patch, subscription
worker statistics are collected per-database and handled in a similar
way to tables and functions. I think perhaps we still need to discuss
details of how the statistics should be handled but I'd like to share
the patch for discussion.

Do you have something specific in mind to discuss the details of how
stats should be handled?
As you commented, I removed stats_reset column from
pg_stat_subscription_workers view since tables and functions stats
view doesn't have it.
Few comments/questions:
====================
1.
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry
*slotstats, TimestampTz ts);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);

Spurious line addition.
Will fix.
2. Why now there is no code to deal with dead table sync entries as
compared to previous version of patch?
I think we discussed that it's better if we keep the same behavior for
stats of apply and table sync workers. So the table sync entries are
dead after the subscription is dropped, like apply entries. No?
3. Why do we need two different functions
pg_stat_reset_subscription_worker_sub and
pg_stat_reset_subscription_worker_subrel to handle reset? Isn't it
sufficient to reset all entries for a subscription if relid is
InvalidOid?
Since setting relid to InvalidOid means an apply worker entry, we
cannot use it for that purpose.
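To illustrate the two reset paths at the SQL level (a sketch only: the function name follows the signature discussed in this thread for the patch under review, and the OIDs are placeholders, so both may change before commit):

```
-- Hypothetical OIDs. Reset the apply worker's stats (relid is NULL):
SELECT pg_stat_reset_subscription_worker(16394, NULL);

-- Reset the tablesync worker's stats for one relation of that subscription:
SELECT pg_stat_reset_subscription_worker(16394, 16385);
```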
4. It seems now stats_reset entry is not present in
pg_stat_subscription_workers? How will users find that information if
required?
Users can find it in pg_stat_database. The same is true for table and
function statistics -- they don't have a stats_reset column, but
resetting them updates stats_reset of their entry in pg_stat_database.
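For example, assuming the behavior described above, the database-wide reset timestamp could be checked with a query like the following (pg_stat_database is an existing view; that it also covers subscription worker stats is specific to this patch):

```
-- stats_reset here reflects the last pg_stat_reset() for the database,
-- which with this patch also clears subscription worker statistics.
SELECT datname, stats_reset
FROM pg_stat_database
WHERE datname = current_database();
```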
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Nov 9, 2021 at 11:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 9, 2021 at 11:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
1.
I don't like the fact that this view is very specific for showing the
errors but the name of the view is very generic. So are we keeping
this name to expand the scope of the view in the future?

Yes, we are planning to display some other xact specific stats as well
corresponding to subscription workers. See [1][2].

[1] - /messages/by-id/OSBPR01MB48887CA8F40C8D984A6DC00CED199@OSBPR01MB4888.jpnprd01.prod.outlook.com
[2] - /messages/by-id/CAA4eK1+1n3upCMB-Y_k9b1wPNCtNE7MEHan9kA1s6GNsZGB0Og@mail.gmail.com
Thanks for pointing me to this thread, I will have a look.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Nov 9, 2021 at 12:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Nov 9, 2021 at 3:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
4. It seems now stats_reset entry is not present in
pg_stat_subscription_workers? How will users find that information if
required?

Users can find it in pg_stat_database. The same is true for table and
function statistics -- they don't have a stats_reset column, but
resetting them updates stats_reset of their entry in pg_stat_database.
Okay, but isn't it better to deal with the reset of subscription
workers via pgstat_recv_resetsinglecounter by introducing subobjectid?
I think that will make code consistent for all database-related stats.
--
With Regards,
Amit Kapila.
On Tue, Nov 9, 2021 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 9, 2021 at 12:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Nov 9, 2021 at 3:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
4. It seems now stats_reset entry is not present in
pg_stat_subscription_workers? How will users find that information if
required?

Users can find it in pg_stat_database. The same is true for table and
function statistics -- they don't have a stats_reset column, but
resetting them updates stats_reset of their entry in pg_stat_database.

Okay, but isn't it better to deal with the reset of subscription
workers via pgstat_recv_resetsinglecounter by introducing subobjectid?
I think that will make code consistent for all database-related stats.
Agreed. It's better to use the same function internally even if the
SQL-callable interfaces are different.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Nov 9, 2021 at 1:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Nov 9, 2021 at 11:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 9, 2021 at 11:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
1.
I don't like the fact that this view is very specific for showing the
errors but the name of the view is very generic. So are we keeping
this name to expand the scope of the view in the future?

Yes, we are planning to display some other xact specific stats as well
corresponding to subscription workers. See [1][2].

[1] - /messages/by-id/OSBPR01MB48887CA8F40C8D984A6DC00CED199@OSBPR01MB4888.jpnprd01.prod.outlook.com
[2] - /messages/by-id/CAA4eK1+1n3upCMB-Y_k9b1wPNCtNE7MEHan9kA1s6GNsZGB0Og@mail.gmail.com

Thanks for pointing me to this thread, I will have a look.
I think we can even add a line in the commit message stating that this
can be extended in the future to track other xact related stats for
subscription workers. I think it will help readers of the patch.
--
With Regards,
Amit Kapila.
On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Nov 3, 2021 at 12:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Nov 2, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 1, 2021 at 7:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Fair enough. So statistics can be removed either by vacuum or drop
subscription. Also, if we go by this logic then there is no harm in
retaining the stat entries for tablesync errors. Why have different
behavior for apply and tablesync workers?My understanding is that the subscription worker statistics entry
corresponds to workers (but not physical workers since the physical
process is changed after restarting). So if the worker finishes its
jobs, it is no longer necessary to show errors since further problems
will not occur after that. Table sync worker’s job finishes when
completing table copy (unless table sync is performed again by REFRESH
PUBLICATION) whereas apply worker’s job finishes when the subscription
is dropped.

Actually, I am not very sure how users can use the old error
information after we allowed skipping the conflicting xid. Say, if
they want to add/remove some constraints on the table based on
previous errors then they might want to refer to errors of both the
apply worker and table sync worker.

I think that in general, statistics should be retained as long as a
corresponding object exists on the database, like other cumulative
statistic views. So I’m concerned that an entry of a cumulative stats
view is automatically removed by a non-stats-related function (e.g.,
ALTER SUBSCRIPTION SKIP), which seems a new behavior for cumulative
stats views.

We can retain the stats entries for table sync worker but what I want
to avoid is that the view shows many old entries that will never be
updated. I've sometimes seen cases where the user mistakenly restored
table data on the subscriber before creating a subscription, failed
table sync on many tables due to unique violation, and truncated
tables on the subscriber. I think that unlike the stats entries for
apply worker, retaining the stats entries for table sync could be
harmful since it’s likely to be a large amount (even hundreds of
entries). Especially, it could lead to bloat the stats file since it
has an error message. So if we do that, I'd like to provide a function
for users to remove (not reset) stats entries manually.

If we follow the idea of keeping stats at db level (in
PgStat_StatDBEntry) as discussed above then I think we already have a
way to remove stat entries via pg_stat_reset which removes the stats
corresponding to tables, functions and after this patch corresponding
to subscriptions as well for the current database. Won't that be
sufficient? I see your point but I think it may be better if we keep
the same behavior for stats of apply and table sync workers.

Make sense.
Following the tables, functions, I thought of keeping the name of the
reset function similar to "pg_stat_reset_single_table_counters" but I
feel the currently used name "pg_stat_reset_subscription_worker" in
the patch is better. Do let me know what you think?

Yeah, I also tend to prefer pg_stat_reset_subscription_worker name
since "single" isn't clear in the context of subscription worker. And
the behavior of the reset function for subscription workers is also
different from pg_stat_reset_single_xxx_counters.

I've attached an updated patch. In this version patch, subscription
worker statistics are collected per-database and handled in a similar
way to tables and functions. I think perhaps we still need to discuss
details of how the statistics should be handled but I'd like to share
the patch for discussion.
Thanks for the updated patch, few comments:
1) should we change "Tables and functions hashes are initialized to
empty" to "Tables, functions and subworker hashes are initialized to
empty"
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+
PGSTAT_SUBWORKER_HASH_SIZE,
+
&hash_ctl,
+
HASH_ELEM | HASH_BLOBS);
2) Since databaseid, tabhash, funchash and subworkerhash are members
of dbentry, can we remove the function arguments databaseid, tabhash,
funchash and subworkerhash and pass dbentry similar to
pgstat_write_db_statsfile function?
@@ -4370,12 +4582,14 @@ done:
*/
static void
pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
- bool permanent)
+ HTAB *subworkerhash,
bool permanent)
{
PgStat_StatTabEntry *tabentry;
PgStat_StatTabEntry tabbuf;
PgStat_StatFuncEntry funcbuf;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry subwbuf;
+ PgStat_StatSubWorkerEntry *subwentry;
3) Can we move pgstat_get_subworker_entry below pgstat_get_db_entry
and pgstat_get_tab_entry, so that the hash lookup can be together
consistently. Similarly pgstat_send_subscription_purge can be moved
after pgstat_send_slru.
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return subscription worker entry with the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, it returns an entry of the
+ * apply worker otherwise of the table sync worker associated with subrelid.
+ * If no subscription entry exists, initialize it, if the create parameter
+ * is true. Else, return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid,
Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
4) This change can be removed from pgstat.c:
@@ -332,9 +339,11 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData
name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry
*slotstats, TimestampTz ts);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
5) I was able to compile without including
catalog/pg_subscription_rel.h, we can remove including
catalog/pg_subscription_rel.h if not required.
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
6) Similarly replication/logicalproto.h also need not be included
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
#include "pgstat.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
#include "replication/slot.h"
#include "storage/proc.h"
7) There is an extra ";", We can remove one ";" from below:
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);;
+
+ key.subid = subid;
+ key.subrelid = subrelid;
Regards,
Vignesh
On Mon, Nov 8, 2021 at 4:10 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Nov 8, 2021 at 1:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. In this version patch, subscription
worker statistics are collected per-database and handled in a similar
way to tables and functions. I think perhaps we still need to discuss
details of how the statistics should be handled but I'd like to share
the patch for discussion.

Thanks for the updated patch.
Some initial comments on the v20 patch:
Thank you for the comments!
doc/src/sgml/monitoring.sgml
(1) wording
The word "information" seems to be missing after "showing" (otherwise
is reads "showing about errors", which isn't correct grammar).
I suggest the following change:

BEFORE:
+ <entry>At least one row per subscription, showing about errors that
+ occurred on subscription.

AFTER:
+ <entry>At least one row per subscription, showing information about
+ errors that occurred on subscription.
Fixed.
(2) pg_stat_reset_subscription_worker(subid Oid, relid Oid) function
documentation
The description doesn't read well. I'd suggest the following change:

BEFORE:
* Resets statistics of a single subscription worker statistics.
AFTER:
* Resets the statistics of a single subscription worker.

I think that the documentation for this function should make it clear
that a non-NULL "subid" parameter is required for both reset cases
(tablesync and apply).
Perhaps this could be done by simply changing the first sentence to say:
"Resets the statistics of a single subscription worker, for a worker
running on the subscription with <parameter>subid</parameter>."
(and then can remove " running on the subscription with
<parameter>subid</parameter>" from the last sentence)
Fixed.
I think that the documentation for this function should say that it
should be used in conjunction with the "pg_stat_subscription_workers"
view in order to obtain the required subid/relid values for resetting.
(and should provide a link to the documentation for that view)
I think it's not necessarily true that users should use
pg_stat_subscription_workers in order to obtain subid/relid since we
can obtain the same also from pg_subscription_rel. But I agree that it
should clarify that this function resets entries of
pg_stat_subscription_workers view. Fixed.
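For instance, the subid/relid pairs can also be obtained from the existing catalogs rather than the new view (a sketch; pg_subscription and pg_subscription_rel are the standard catalogs):

```
-- List each subscription with the relations it synchronizes;
-- srrelid would be the relid to pass when resetting a tablesync worker's stats.
SELECT s.oid AS subid, s.subname, sr.srrelid AS relid, sr.srsubstate
FROM pg_subscription s
LEFT JOIN pg_subscription_rel sr ON sr.srsubid = s.oid;
```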
Also, I think that the function documentation should make it clear how
to distinguish the tablesync vs apply worker statistics case.
e.g. the tablesync error case is indicated by a null "command" in the
information returned from the "pg_stat_subscription_workers" view
(otherwise it seems a user could only know this by looking at the server log).
The documentation of pg_stat_subscription_workers explains that
subrelid is always NULL for apply workers. Is it not enough?
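As a sketch of what that documentation implies for users (the view and its columns come from the patch under discussion, not a released release, so names may change):

```
-- subrelid is documented as always NULL for the apply worker in this patch
SELECT subname,
       CASE WHEN subrelid IS NULL THEN 'apply' ELSE 'tablesync' END AS worker_type,
       error_message
FROM pg_stat_subscription_workers;
```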
Finally, there are currently no tests for this new function.
I've added some tests.
(3) pg_stat_subscription_workers
In the documentation for this, some users may not realise that "the
initial data copy" refers to "tablesync", so maybe say "the initial
data copy (tablesync)", or similar.
Perhaps it's better not to use the term "tablesync" since we don't use
the term anywhere now. Instead, we should say more clearly, say
"subscription worker handling initial data copy of the relation", as
the description of pg_stat_subscription says.
(4) stats_reset
"stats_reset" is currently documented as the last column of the
"pg_stat_subscription_workers" view - but it's actually no longer
included in the view.
Removed.
(5) src/tools/pgindent/typedefs.list
The following current entries are bogus:
PgStat_MsgSubWorkerErrorPurge
PgStat_MsgSubWorkerPurge

The following entry is missing:
PgStat_MsgSubscriptionPurge
Fixed.
I'll submit an updated patch soon.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Nov 9, 2021 at 3:07 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. In this version patch, subscription
worker statistics are collected per-database and handled in a similar
way to tables and functions. I think perhaps we still need to discuss
details of how the statistics should be handled but I'd like to share
the patch for discussion.

While reviewing the v20, I have some initial comments,

+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>At least one row per subscription, showing about errors that
+ occurred on subscription.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>

1.
I don't like the fact that this view is very specific for showing the
errors but the name of the view is very generic. So are we keeping
this name to expand the scope of the view in the future? If this is
meant only for showing the errors then the name should be more
specific.
As Amit already mentioned, we're planning to add more xact statistics
to this view. I've mentioned that in the commit message.
2.
Why comment says "At least one row per subscription"? this looks
confusing, I mean if there is no error then there will not be even one
row right?

+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription error reported by workers applying logical
+ replication changes and workers handling the initial data copy of the
+ subscribed tables.
+ </para>
Right. Fixed.
3.
So there will only be one row per subscription? I did not read the
code, but suppose there was an error due to some constraint; now if
that constraint is removed and there is a new error, will the old error
be removed immediately or will it be removed by autovacuum? If it is
not removed immediately then there could be multiple errors per
subscription in the view, so the comment is not correct.
There is one row per subscription worker (apply worker and tablesync
worker). If the same error consecutively occurred, error_count is
incremented and last_error_time is updated. Otherwise, e.g., if a
different error occurred on the apply worker, all statistics are
updated.
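The update rule described here can be sketched as a tiny, self-contained C mirror of the collector logic. This is a hypothetical simplification for illustration: the struct and function names are invented, not the actual PgStat_* definitions in the patch, and timestamps are plain integers.

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-in for the per-worker stats entry; the real entry is
 * keyed by (subid, subrelid), where subrelid = InvalidOid means the apply
 * worker. Only the fields needed for the update rule are kept. */
typedef struct SubWorkerError
{
    unsigned int relid;          /* relation being processed on error */
    unsigned int xid;            /* remote transaction ID */
    long         error_count;
    long         first_error_time;
    long         last_error_time;
    char         error_message[128];
} SubWorkerError;

/* If the incoming report matches the stored error exactly, only bump
 * error_count and last_error_time; otherwise overwrite the details and
 * restart the count (and first_error_time) from this report. */
static void
report_error(SubWorkerError *e, unsigned int relid, unsigned int xid,
             const char *message, long now)
{
    if (e->relid == relid && e->xid == xid &&
        strcmp(e->error_message, message) == 0)
    {
        e->error_count++;
        e->last_error_time = now;
        return;
    }

    e->relid = relid;
    e->xid = xid;
    e->error_count = 1;
    e->first_error_time = now;
    e->last_error_time = now;
    strncpy(e->error_message, message, sizeof(e->error_message) - 1);
    e->error_message[sizeof(e->error_message) - 1] = '\0';
}
```

With this rule, a retry loop hitting the same constraint violation shows up as one row with a growing error_count, while any change in the failing relation, XID, or message resets the row.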
4.

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+      </para>
+      <para>
+       Time at which the last error occurred
+      </para></entry>
+     </row>

Will it be useful to know when the error occurred for the first time?
Good idea. Users can know when the subscription stopped due to this
error. Added.
5.

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+      </para>
+      <para>

The actual view does not contain this column.
Removed.
6.

+  <para>
+   Resets statistics of a single subscription worker statistics.

/Resets statistics of a single subscription worker statistics/Resets
statistics of a single subscription worker
Fixed.
I'll submit an updated patch soon.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Nov 10, 2021 at 12:49 PM vignesh C <vignesh21@gmail.com> wrote:
Thanks for the updated patch, Few comments:
Thank you for the comments!
1) Should we change "Tables and functions hashes are initialized to empty" to "Tables, functions and subworker hashes are initialized to empty"?

+  hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+  hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+  dbentry->subworkers = hash_create("Per-database subscription worker",
+                                    PGSTAT_SUBWORKER_HASH_SIZE,
+                                    &hash_ctl,
+                                    HASH_ELEM | HASH_BLOBS);
Fixed.
2) Since databaseid, tabhash, funchash and subworkerhash are members of dbentry, can we remove the function arguments databaseid, tabhash, funchash and subworkerhash and pass dbentry, similar to the pgstat_write_db_statsfile function?

@@ -4370,12 +4582,14 @@ done:
  */
 static void
 pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
-                         bool permanent)
+                         HTAB *subworkerhash, bool permanent)
 {
     PgStat_StatTabEntry *tabentry;
     PgStat_StatTabEntry tabbuf;
     PgStat_StatFuncEntry funcbuf;
     PgStat_StatFuncEntry *funcentry;
+    PgStat_StatSubWorkerEntry subwbuf;
+    PgStat_StatSubWorkerEntry *subwentry;
As the comment of this function says, this function has the ability to
skip storing per-table or per-function (and or
per-subscription-workers) data, if NULL is passed for the
corresponding hashtable, although that's not used at the moment. IMO
it'd be better to keep such behavior.
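For reference, the pass-NULL-to-skip convention being kept here can be illustrated with a small, self-contained sketch. This is hypothetical code, not the actual file-reading routine: the record type and loader are invented, and the tagged records stand in for the 'T'/'F'/'S' entries in the per-database stats file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tagged record, standing in for the per-database stats file
 * entries ('T' = table, 'S' = subscription worker). */
typedef struct Record
{
    char kind;
    int  value;
} Record;

/* Every record is consumed from the input, but a record kind is stored only
 * when the caller supplied a destination array; passing NULL means "read past
 * this kind of data without storing it", mirroring the comment on
 * pgstat_read_db_statsfile. */
static void
load_records(const Record *recs, int nrecs,
             int *tables, int *ntables,
             int *subworkers, int *nsubworkers)
{
    for (int i = 0; i < nrecs; i++)
    {
        if (recs[i].kind == 'T')
        {
            if (tables == NULL)
                continue;        /* table data not wanted */
            tables[(*ntables)++] = recs[i].value;
        }
        else if (recs[i].kind == 'S')
        {
            if (subworkers == NULL)
                continue;        /* subscription worker data not wanted */
            subworkers[(*nsubworkers)++] = recs[i].value;
        }
    }
}
```

The point of the convention is that the reader always advances through the file, so skipping one kind of data never desynchronizes the parse of the remaining records.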
3) Can we move pgstat_get_subworker_entry below pgstat_get_db_entry and pgstat_get_tab_entry, so that the hash lookups can be together consistently? Similarly, pgstat_send_subscription_purge can be moved after pgstat_send_slru.

+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return subscription worker entry with the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, it returns an entry of the
+ * apply worker otherwise of the table sync worker associated with subrelid.
+ * If no subscription entry exists, initialize it, if the create parameter
+ * is true. Else, return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+                           bool create)
+{
+    PgStat_StatSubWorkerEntry *subwentry;
+    PgStat_StatSubWorkerKey key;
+    bool found;
Agreed. Moved.
4) This change can be removed from pgstat.c:
@@ -332,9 +339,11 @@ static bool pgstat_db_requested(Oid databaseid);
static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
+
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
Removed.
5) I was able to compile without including catalog/pg_subscription_rel.h; we can remove including catalog/pg_subscription_rel.h if not required.

--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
 #include "catalog/catalog.h"
 #include "catalog/pg_database.h"
 #include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"
Removed.
6) Similarly, replication/logicalproto.h also need not be included.

--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
 #include "pgstat.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
 #include "replication/slot.h"
 #include "storage/proc.h"
Removed.
7) There is an extra ";". We can remove one ";" from below:

+    PgStat_StatSubWorkerKey key;
+    bool found;
+    HASHACTION action = (create ? HASH_ENTER : HASH_FIND);;
+
+    key.subid = subid;
+    key.subrelid = subrelid;
Fixed.
I've attached an updated patch that incorporates all comments I got so
far. Please review it.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v21-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch (application/octet-stream)
From bfabcb5ec86973562f5cc59a6a56c29da7ed9906 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v21 1/3] Add a subscription worker statistics view
"pg_stat_subscription_workers".
This commit adds a new system view pg_stat_subscription_workers,
that shows information about any errors which occur during application
of logical replication changes as well as during the initial table
synchronization. The subscription statistics entries are removed when
the corresponding subscription is removed.
It also adds an SQL function pg_stat_reset_subscription_worker() to
reset single subscription errors.
The contents of this view can be used by an upcoming patch that skips
the particular transaction that conflicts with the existing data on
the subscriber.
This view can be extended in the future to track other xact related
statistics for subscription workers.
---
doc/src/sgml/monitoring.sgml | 167 +++++++++
src/backend/catalog/system_functions.sql | 4 +
src/backend/catalog/system_views.sql | 25 ++
src/backend/commands/subscriptioncmds.c | 15 +-
src/backend/postmaster/pgstat.c | 379 +++++++++++++++++++-
src/backend/replication/logical/worker.c | 54 ++-
src/backend/utils/adt/pgstatfuncs.c | 134 ++++++-
src/include/catalog/pg_proc.dat | 18 +
src/include/pgstat.h | 105 +++++-
src/test/regress/expected/rules.out | 20 ++
src/test/subscription/t/026_error_report.pl | 191 ++++++++++
src/tools/pgindent/typedefs.list | 4 +
12 files changed, 1095 insertions(+), 21 deletions(-)
create mode 100644 src/test/subscription/t/026_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3173ec2566..4ee97dbb2d 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>One row per subscription worker, showing statistics about errors
+ that occurred on that subscription worker.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3034,6 +3043,138 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription error reported by workers applying logical
+ replication changes and workers handling the initial data copy of the
+ subscribed tables. The statistics entry is removed when the subscription
+ the worker is running on is removed.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This field
+ is always NULL if the error was reported during the initial data
+ copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is always NULL if the error was reported
+ during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>first_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the first error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
+ </para></entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5156,6 +5297,32 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <optional> <parameter>relid</parameter> <type>oid</type> </optional> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the statistics of a single subscription worker running on the
+ subscription with <parameter>subid</parameter> shown in the
+ <structname>pg_stat_subscription_workers</structname> view. If the
+ argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets statistics of the subscription worker handling the initial data
+ copy of the relation with <parameter>relid</parameter>. Otherwise,
+ resets the subscription worker statistics of the main apply worker.
+ If the argument <parameter>relid</parameter> is omitted, resets the
+ statistics of all subscription workers running on the subscription
+ with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 54c93b16c4..cd1d649f9f 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,10 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb560955cd..cb2f77cd1e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,3 +1261,28 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.first_error_time,
+ w.last_error_time
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
+ JOIN pg_subscription s ON (w.subid = s.oid);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..18962b91e1 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -32,6 +32,7 @@
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/origin.h"
#include "replication/slot.h"
@@ -1204,7 +1205,8 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Since dropping a replication slot is not transactional, the replication
* slot stays dropped even if the transaction rolls back. So we cannot
* run DROP SUBSCRIPTION inside a transaction block if dropping the
- * replication slot.
+ * replication slot. Also, in this case, we report a message for dropping
+ * the subscription to the stats collector.
*
* XXX The command name should really be something like "DROP SUBSCRIPTION
* of a subscription that is associated with a replication slot", but we
@@ -1377,6 +1379,17 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
PG_END_TRY();
+ /*
+ * Send a message for dropping this subscription to the stats collector. We
+ * can safely report dropping the subscription statistics here if the
+ * subscription is associated with a replication slot since we cannot run
+ * DROP SUBSCRIPTION inside a transaction block. Subscription statistics will
+ * be removed later by (auto)vacuum either if it's not associated with a
+ * replication slot or if the message for dropping the subscription gets lost.
+ */
+ if (slotname)
+ pgstat_report_subscription_drop(subid);
+
table_close(rel, NoLock);
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8c166e5e16..a620379957 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -106,6 +107,7 @@
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
#define PGSTAT_REPLSLOT_HASH_SIZE 32
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
/* ----------
@@ -320,10 +322,14 @@ NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_no
static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
Oid tableoid, bool create);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry,
+ Oid subid, Oid subrelid,
+ bool create);
static void pgstat_write_statsfiles(bool permanent, bool allDbs);
static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
-static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, bool permanent);
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+ HTAB *subworkerhash, bool permanent);
static void backend_read_statsfile(void);
static bool pgstat_write_statsfile_needed(void);
@@ -335,6 +341,7 @@ static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, Timestamp
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
+static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
static bool pgstat_should_report_connstat(void);
static void pgstat_report_disconnect(Oid dboid);
@@ -373,6 +380,8 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1302,6 +1311,55 @@ pgstat_vacuum_stat(void)
hash_destroy(htab);
}
+
+ /*
+ * Repeat for subscription workers. Similarly, we needn't bother
+ * in the common case where no subscription worker stats are being
+ * collected.
+ */
+ if (dbentry->subworkers != NULL &&
+ hash_get_num_entries(dbentry->subworkers) > 0)
+ {
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ pgstat_setheader(&spmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL)
+ != NULL)
+ continue;
+
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_send_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_send_subscription_purge(&spmsg);
+
+ hash_destroy(htab);
+ }
}
@@ -1474,7 +1532,8 @@ pgstat_reset_shared_counters(const char *target)
* ----------
*/
void
-pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
+pgstat_reset_single_counter(Oid objoid, Oid subobjoid,
+ PgStat_Single_Reset_Type type)
{
PgStat_MsgResetsinglecounter msg;
@@ -1485,6 +1544,7 @@ pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
msg.m_databaseid = MyDatabaseId;
msg.m_resettype = type;
msg.m_objectid = objoid;
+ msg.m_subobjectid = subobjoid;
pgstat_send(&msg, sizeof(msg));
}
@@ -1869,6 +1929,53 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+ len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_report_subscription_drop() -
+ *
+ * Tell the collector about dropping the subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_drop(Oid subid)
+{
+ PgStat_MsgSubscriptionPurge msg;
+
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subids[0] = subid;
+ msg.m_nentries = 1;
+ pgstat_send_subscription_purge(&msg);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2874,6 +2981,33 @@ pgstat_fetch_stat_funcentry(Oid func_id)
return funcentry;
}
+/*
+ * ---------
+ * pgstat_fetch_stat_subworker_entry() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct or NULL.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_stat_subworker_entry(Oid subid, Oid subrelid)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *wentry = NULL;
+
+ /* Load the stats file if needed */
+ backend_read_statsfile();
+
+ /* Look up database, then find the requested subscription worker stats */
+ dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+ if (dbentry != NULL && dbentry->subworkers != NULL)
+ {
+ wentry = pgstat_get_subworker_entry(dbentry, subid, subrelid,
+ false);
+ }
+
+ return wentry;
+}
/*
* ---------
@@ -3312,6 +3446,23 @@ pgstat_send_slru(void)
}
}
+/* --------
+ * pgstat_send_subscription_purge() -
+ *
+ * Send a subscription purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
/* ----------
* PgstatCollectorMain() -
@@ -3568,6 +3719,14 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
default:
break;
}
@@ -3613,7 +3772,8 @@ PgstatCollectorMain(int argc, char *argv[])
/*
* Subroutine to clear stats in a database entry
*
- * Tables and functions hashes are initialized to empty.
+ * Tables, functions, and subscription workers hashes are initialized
+ * to empty.
*/
static void
reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
@@ -3666,6 +3826,13 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
PGSTAT_FUNCTION_HASH_SIZE,
&hash_ctl,
HASH_ELEM | HASH_BLOBS);
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
}
/*
@@ -3690,7 +3857,7 @@ pgstat_get_db_entry(Oid databaseid, bool create)
/*
* If not found, initialize the new one. This creates empty hash tables
- * for tables and functions, too.
+ * for tables, functions, and subscription worker, too.
*/
if (!found)
reset_dbentry_counters(result);
@@ -3748,6 +3915,45 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
return result;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return subscription worker entry with the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, it returns an entry of the
+ * apply worker otherwise of the table sync worker associated with subrelid.
+ * If no subscription entry exists, initialize it, if the create parameter
+ * is true. Else, return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+ (void *) &key,
+ action, &found);
+
+ /* If not found, initialize the new one */
+ if (create && !found)
+ {
+ subwentry->relid = InvalidOid;
+ subwentry->command = 0;
+ subwentry->xid = InvalidTransactionId;
+ subwentry->error_count = 0;
+ subwentry->first_error_time = 0;
+ subwentry->last_error_time = 0;
+ subwentry->error_message[0] = '\0';
+ }
+
+ return subwentry;
+}
/* ----------
* pgstat_write_statsfiles() -
@@ -3947,8 +4153,10 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
{
HASH_SEQ_STATUS tstat;
HASH_SEQ_STATUS fstat;
+ HASH_SEQ_STATUS sstat;
PgStat_StatTabEntry *tabentry;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpout;
int32 format_id;
Oid dbid = dbentry->databaseid;
@@ -4003,6 +4211,17 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
(void) rc; /* we'll check for error with ferror */
}
+ /*
+ * Walk through the database's subscription worker stats table.
+ */
+ hash_seq_init(&sstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(subwentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4241,6 +4460,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
memcpy(dbentry, &dbbuf, sizeof(PgStat_StatDBEntry));
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* In the collector, disregard the timestamp we read from the
@@ -4252,8 +4472,8 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
dbentry->stats_timestamp = 0;
/*
- * Don't create tables/functions hashtables for uninteresting
- * databases.
+ * Don't create tables/functions/subworkers hashtables for
+ * uninteresting databases.
*/
if (onlydb != InvalidOid)
{
@@ -4278,6 +4498,14 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
&hash_ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
/*
* If requested, read the data from the database-specific
* file. Otherwise we just leave the hashtables empty.
@@ -4286,6 +4514,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
pgstat_read_db_statsfile(dbentry->databaseid,
dbentry->tables,
dbentry->functions,
+ dbentry->subworkers,
permanent);
break;
@@ -4363,19 +4592,21 @@ done:
* As in pgstat_read_statsfiles, if the permanent file is requested, it is
* removed after reading.
*
- * Note: this code has the ability to skip storing per-table or per-function
- * data, if NULL is passed for the corresponding hashtable. That's not used
- * at the moment though.
+ * Note: this code has the ability to skip storing per-table, per-function or
+ * per-subscription-worker data, if NULL is passed for the corresponding hashtable.
+ * That's not used at the moment though.
* ----------
*/
static void
pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
- bool permanent)
+ HTAB *subworkerhash, bool permanent)
{
PgStat_StatTabEntry *tabentry;
PgStat_StatTabEntry tabbuf;
PgStat_StatFuncEntry funcbuf;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry subwbuf;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpin;
int32 format_id;
bool found;
@@ -4489,6 +4720,41 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
memcpy(funcentry, &funcbuf, sizeof(funcbuf));
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&subwbuf, 1, sizeof(PgStat_StatSubWorkerEntry),
+ fpin) != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /*
+ * Skip if subscription worker data not wanted.
+ */
+ if (subworkerhash == NULL)
+ break;
+
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(subworkerhash,
+ (void *) &subwbuf.key,
+ HASH_ENTER, &found);
+
+ if (found)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ memcpy(subwentry, &subwbuf, sizeof(subwbuf));
+ break;
+
/*
* 'E' The EOF marker of a complete stats file.
*/
@@ -5162,6 +5428,8 @@ pgstat_recv_dropdb(PgStat_MsgDropdb *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
if (hash_search(pgStatDBHash,
(void *) &dbid,
@@ -5199,13 +5467,16 @@ pgstat_recv_resetcounter(PgStat_MsgResetcounter *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* Reset database-level stats, too. This creates empty hash tables for
- * tables and functions.
+ * tables, functions, and subscription workers.
*/
reset_dbentry_counters(dbentry);
}
@@ -5274,6 +5545,14 @@ pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len)
else if (msg->m_resettype == RESET_FUNCTION)
(void) hash_search(dbentry->functions, (void *) &(msg->m_objectid),
HASH_REMOVE, NULL);
+ else if (msg->m_resettype == RESET_SUBWORKER)
+ {
+ PgStat_StatSubWorkerKey key;
+
+ key.subid = msg->m_objectid;
+ key.subrelid = msg->m_subobjectid;
+ (void) hash_search(dbentry->subworkers, (void *) &key, HASH_REMOVE, NULL);
+ }
}
/* ----------
@@ -5816,6 +6095,84 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS hstat;
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+
+ /* No need to purge if we don't even know the database */
+ if (!dbentry || !dbentry->subworkers)
+ return;
+
+ /* Remove all subscription worker statistics of the given subscriptions */
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (subwentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(dbentry->subworkers, (void *) &(subwentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+ /* Get the subscription worker stats */
+ subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+ msg->m_subrelid, true);
+ Assert(subwentry);
+
+ /*
+ * Update only the counter and last error timestamp if we received
+ * the same error again
+ */
+ if (subwentry->relid == msg->m_relid &&
+ subwentry->command == msg->m_command &&
+ subwentry->xid == msg->m_xid &&
+ strcmp(subwentry->error_message, msg->m_message) == 0)
+ {
+ subwentry->error_count++;
+ subwentry->last_error_time = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ subwentry->relid = msg->m_relid;
+ subwentry->command = msg->m_command;
+ subwentry->xid = msg->m_xid;
+ subwentry->error_count = 1;
+ subwentry->first_error_time = msg->m_timestamp;
+ subwentry->last_error_time = msg->m_timestamp;
+ strlcpy(subwentry->error_message, msg->m_message, PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ae1b391bda..2e79302a48 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3332,6 +3332,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3432,8 +3433,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding message
+ * type for table synchronization.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3551,7 +3574,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index ff5aedc99c..88bf0a6076 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2171,7 +2171,7 @@ pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS)
{
Oid taboid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(taboid, RESET_TABLE);
+ pgstat_reset_single_counter(taboid, InvalidOid, RESET_TABLE);
PG_RETURN_VOID();
}
@@ -2181,7 +2181,18 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
{
Oid funcoid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(funcoid, RESET_FUNCTION);
+ pgstat_reset_single_counter(funcoid, InvalidOid, RESET_FUNCTION);
+
+ PG_RETURN_VOID();
+}
+
+Datum
+pg_stat_reset_subscription_worker_subrel(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid = PG_ARGISNULL(1) ? InvalidOid : PG_GETARG_OID(1);
+
+ pgstat_reset_single_counter(subid, relid, RESET_SUBWORKER);
PG_RETURN_VOID();
}
@@ -2239,6 +2250,21 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset all subscription worker stats associated with the given subscription */
+Datum
+pg_stat_reset_subscription_worker_sub(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+
+ /*
+ * Use subscription drop message to remove statistics of all subscription
+ * workers.
+ */
+ pgstat_report_subscription_drop(subid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2379,3 +2405,107 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_stat_subworker_entry(subid, subrelid);
+
+ /* Return NULL if there is no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "error_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "first_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* relid */
+ if (OidIsValid(wentry->relid))
+ values[i++] = ObjectIdGetDatum(wentry->relid);
+ else
+ nulls[i++] = true;
+
+ /* command */
+ if (wentry->command != 0)
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(wentry->command));
+ else
+ nulls[i++] = true;
+
+ /* xid */
+ if (TransactionIdIsValid(wentry->xid))
+ values[i++] = TransactionIdGetDatum(wentry->xid);
+ else
+ nulls[i++] = true;
+
+ /* error_count */
+ values[i++] = Int64GetDatum(wentry->error_count);
+
+ /* error_message */
+ values[i++] = CStringGetTextDatum(wentry->error_message);
+
+ /* first_error_time */
+ if (wentry->first_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->first_error_time);
+ else
+ nulls[i++] = true;
+
+ /* last_error_time */
+ if (wentry->last_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->last_error_time);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..50e1c7b68d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription worker',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,relid,command,xid,error_count,error_message,first_error_time,last_error_time}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,16 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker_subrel' },
+{ oid => '8525',
+ descr => 'statistics: reset all collected statistics for a single subscription',
+ proname => 'pg_stat_reset_subscription_worker',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid',
+ prosrc => 'pg_stat_reset_subscription_worker_sub' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..6643938b55 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -83,6 +84,8 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
} StatMsgType;
/* ----------
@@ -145,7 +148,8 @@ typedef enum PgStat_Shared_Reset_Target
typedef enum PgStat_Single_Reset_Type
{
RESET_TABLE,
- RESET_FUNCTION
+ RESET_FUNCTION,
+ RESET_SUBWORKER
} PgStat_Single_Reset_Type;
/* ------------------------------------------------------------
@@ -364,6 +368,7 @@ typedef struct PgStat_MsgResetsinglecounter
Oid m_databaseid;
PgStat_Single_Reset_Type m_resettype;
Oid m_objectid;
+ Oid m_subobjectid;
} PgStat_MsgResetsinglecounter;
/* ----------
@@ -536,6 +541,53 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about the dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_databaseid;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
+ * report an error that occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of the error. m_subrelid is InvalidOid if the error is reported
+ * by an apply worker; otherwise it is reported by a table sync worker.
+ */
+ Oid m_databaseid;
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * Oids of the database and the table that the reporter was actually
+ * processing. m_relid can be InvalidOid if the error occurred while the
+ * worker was applying a non-data-modification message such as RELATION.
+ */
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -714,6 +766,8 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
} PgStat_Msg;
@@ -768,11 +822,16 @@ typedef struct PgStat_StatDBEntry
TimestampTz stats_timestamp; /* time of db stats file update */
/*
- * tables and functions must be last in the struct, because we don't write
- * the pointers out to the stats file.
+ * tables, functions, and subscription workers must be last in the struct,
+ * because we don't write the pointers out to the stats file.
+ *
+ * subworkers is the hash table of PgStat_StatSubWorkerEntry, which stores
+ * statistics of logical replication workers: apply workers and table
+ * sync workers.
*/
HTAB *tables;
HTAB *functions;
+ HTAB *subworkers;
} PgStat_StatDBEntry;
@@ -929,6 +988,36 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of logical replication or the initial table
+ * synchronization.
+ */
+ Oid dbid;
+ Oid relid;
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter error_count;
+ TimestampTz first_error_time;
+ TimestampTz last_error_time;
+ char error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1019,9 +1108,11 @@ extern void pgstat_drop_database(Oid databaseid);
extern void pgstat_clear_snapshot(void);
extern void pgstat_reset_counters(void);
extern void pgstat_reset_shared_counters(const char *);
-extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
+extern void pgstat_reset_single_counter(Oid objectid, Oid subobjectid,
+ PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid, bool allstats);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
@@ -1038,6 +1129,10 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
@@ -1129,6 +1224,8 @@ extern void pgstat_send_wal(bool force);
extern PgStat_StatDBEntry *pgstat_fetch_stat_dbentry(Oid dbid);
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_stat_subworker_entry(Oid subid,
+ Oid subrelid);
extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..cb6da2c140 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,26 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.first_error_time,
+ w.last_error_time
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, relid, command, xid, error_count, error_message, first_error_time, last_error_time)
+ JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/026_error_report.pl b/src/test/subscription/t/026_error_report.pl
new file mode 100644
index 0000000000..1227654774
--- /dev/null
+++ b/src/test/subscription/t/026_error_report.pl
@@ -0,0 +1,191 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 5;
+
+# Test if the error reported on pg_stat_subscription_workers view is expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass];
+ $check_sql .= " AND xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, command, relid::regclass, error_count > 0
+FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Check if there is no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter into
+# infinite error loop due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Test for resetting subscription worker statistics.
+# Truncate test_tab1 and test_tab2 so that logical replication can continue.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "TRUNCATE test_tab1, test_tab2;");
+
+# Wait for the data to be replicated.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab1");
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab2");
+
+# Reset stats of all subscription workers running on tap_sub.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+SELECT pg_stat_reset_subscription_worker(sw.subid)
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Wait for stats of all subscription workers running on tap_sub to be reset.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) = 0
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8ed83..f41ef0d2bc 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1943,6 +1943,8 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionPurge
+PgStat_MsgSubWorkerError
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1954,6 +1956,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch that incorporates all comments I got so
far. Please review it.
Thanks for the updated patch.
A few minor comments:
doc/src/sgml/monitoring.sgml
(1) tab in doc updates
There's a tab before "Otherwise,":
+ copy of the relation with <parameter>relid</parameter>.
Otherwise,
src/backend/utils/adt/pgstatfuncs.c
(2) The function comment for "pg_stat_reset_subscription_worker_sub"
seems a bit long and I expected it to be multi-line (did you run
pgindent?)
src/include/pgstat.h
(3) Remove PgStat_StatSubWorkerEntry.dbid?
The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
seem to be used, so I think it should be removed.
(I could remove it and everything builds OK and tests pass).
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch that incorporates all comments I got so
far. Please review it.

Thanks for the updated patch.

A few minor comments:

doc/src/sgml/monitoring.sgml
(1) tab in doc updates
There's a tab before "Otherwise,":
+ copy of the relation with <parameter>relid</parameter>.
Otherwise,
Fixed.
src/backend/utils/adt/pgstatfuncs.c
(2) The function comment for "pg_stat_reset_subscription_worker_sub"
seems a bit long and I expected it to be multi-line (did you run
pgindent?)
I ran pgindent on pgstatfuncs.c but it didn't become a multi-line comment.
src/include/pgstat.h
(3) Remove PgStat_StatSubWorkerEntry.dbid?
The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
seem to be used, so I think it should be removed.
(I could remove it and everything builds OK and tests pass).
Fixed.
Thank you for the comments! I've attached an updated version of the patch.
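For reviewers trying out the patch, here is a rough usage sketch (the subscription name "mysub" is hypothetical; the function signatures follow the pg_proc.dat entries in this patch):

```sql
-- Inspect the errors reported by subscription workers
SELECT subname, subrelid, command, xid, error_count, error_message
FROM pg_stat_subscription_workers;

-- Reset the error statistics of the apply worker of subscription "mysub"
-- (two-argument variant; a NULL subrelid targets the apply worker)
SELECT pg_stat_reset_subscription_worker(s.oid, NULL)
FROM pg_subscription s WHERE s.subname = 'mysub';

-- Reset the statistics of all workers of "mysub" (one-argument variant)
SELECT pg_stat_reset_subscription_worker(s.oid)
FROM pg_subscription s WHERE s.subname = 'mysub';
```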
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v22-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch
From 32dda2b772933c2359874e99e496652d08c8e6a1 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v22 1/3] Add a subscription worker statistics view
"pg_stat_subscription_workers".
This commit adds a new system view, pg_stat_subscription_workers,
which shows information about any errors that occur during the
application of logical replication changes as well as during the
initial table synchronization. The subscription statistics entries
are removed when the corresponding subscription is removed.
It also adds an SQL function pg_stat_reset_subscription_worker() to
reset the error statistics of subscription workers.
The contents of this view can be used by an upcoming patch that skips
the particular transaction that conflicts with the existing data on
the subscriber.
This view can be extended in the future to track other xact related
statistics for subscription workers.
---
doc/src/sgml/monitoring.sgml | 167 +++++++++
src/backend/catalog/system_functions.sql | 4 +
src/backend/catalog/system_views.sql | 25 ++
src/backend/commands/subscriptioncmds.c | 16 +-
src/backend/postmaster/pgstat.c | 379 +++++++++++++++++++-
src/backend/replication/logical/worker.c | 54 ++-
src/backend/utils/adt/pgstatfuncs.c | 134 ++++++-
src/include/catalog/pg_proc.dat | 18 +
src/include/pgstat.h | 104 +++++-
src/test/regress/expected/rules.out | 20 ++
src/test/subscription/t/026_error_report.pl | 191 ++++++++++
src/tools/pgindent/typedefs.list | 4 +
12 files changed, 1095 insertions(+), 21 deletions(-)
create mode 100644 src/test/subscription/t/026_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3173ec2566..daf9fe89d5 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>One row per subscription worker, showing statistics about errors
+ that occurred on that subscription worker.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3034,6 +3043,138 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription error reported by workers applying logical
+ replication changes and workers handling the initial data copy of the
+ subscribed tables. The statistics entry is removed when the
+ corresponding subscription is removed.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This field
+ is always NULL if the error was reported during the initial data
+ copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node's transaction that was being
+ applied when the error occurred. This field is always NULL if the
+ error was reported during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>first_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the first error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
+ </para></entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5156,6 +5297,32 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <optional> <parameter>relid</parameter> <type>oid</type> </optional> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the statistics of subscription workers running on the
+ subscription with <parameter>subid</parameter> shown in the
+ <structname>pg_stat_subscription_workers</structname> view. If the
+ argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets the statistics of the subscription worker handling the initial
+ data copy of the relation with <parameter>relid</parameter>; if
+ <parameter>relid</parameter> is <literal>NULL</literal>, resets the
+ statistics of the main apply worker. If the argument
+ <parameter>relid</parameter> is omitted, resets the statistics of all
+ subscription workers running on the subscription with
+ <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
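To make the intended workflow concrete, here is a sketch of how the new view and reset function could be used together (the subscription name `test_sub` and the exact output columns queried are illustrative, not part of the patch):

```sql
-- Inspect the latest error recorded for each worker on a subscription.
SELECT subname, subrelid, relid, command, xid, error_count, error_message
FROM pg_stat_subscription_workers
WHERE subname = 'test_sub';

-- Reset the statistics of the main apply worker only (relid = NULL)...
SELECT pg_stat_reset_subscription_worker(
         (SELECT oid FROM pg_subscription WHERE subname = 'test_sub'), NULL);

-- ...or reset every worker on the subscription (one-argument form).
SELECT pg_stat_reset_subscription_worker(
         (SELECT oid FROM pg_subscription WHERE subname = 'test_sub'));
```

In the proposed design, the `xid` column is what a user would feed to a transaction-skipping command such as the `ALTER SUBSCRIPTION ... SET SKIP` idea mentioned above.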
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 54c93b16c4..cd1d649f9f 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,10 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb560955cd..cb2f77cd1e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,3 +1261,28 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.first_error_time,
+ w.last_error_time
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
+ JOIN pg_subscription s ON (w.subid = s.oid);
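The view definition above unions the apply worker (a `NULL` relid from `pg_subscription`) with the table sync workers (rows from `pg_subscription_rel` not yet in the ready state), then calls the stats function laterally for each. The underlying function can also be called directly; a hypothetical example (subscription name assumed):

```sql
-- Fetch the stats entry for the main apply worker of one subscription.
-- Passing NULL as the second argument selects the apply worker rather
-- than a table sync worker.
SELECT w.*
FROM pg_subscription s,
     LATERAL pg_stat_get_subscription_worker(s.oid, NULL) w
WHERE s.subname = 'test_sub';
```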
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..9427e86fee 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -32,6 +32,7 @@
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/origin.h"
#include "replication/slot.h"
@@ -1204,7 +1205,8 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Since dropping a replication slot is not transactional, the replication
* slot stays dropped even if the transaction rolls back. So we cannot
* run DROP SUBSCRIPTION inside a transaction block if dropping the
- * replication slot.
+ * replication slot. Also, in this case, we send a message to the stats
+ * collector to drop the subscription statistics.
*
* XXX The command name should really be something like "DROP SUBSCRIPTION
* of a subscription that is associated with a replication slot", but we
@@ -1377,6 +1379,18 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
PG_END_TRY();
+ /*
+ * Send a message for dropping this subscription to the stats collector.
+ * We can safely report dropping the subscription statistics here if the
+ * subscription is associated with a replication slot since we cannot run
+ * DROP SUBSCRIPTION inside a transaction block. Subscription statistics
+ * will be removed later by (auto)vacuum either if it's not associated
+ * with a replication slot or if the message for dropping the subscription
+ * gets lost.
+ */
+ if (slotname)
+ pgstat_report_subscription_drop(subid);
+
table_close(rel, NoLock);
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8c166e5e16..209a2b49ce 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +106,7 @@
#define PGSTAT_DB_HASH_SIZE 16
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
+#define PGSTAT_SUBWORKER_HASH_SIZE 256
#define PGSTAT_REPLSLOT_HASH_SIZE 32
@@ -320,10 +322,14 @@ NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_no
static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
Oid tableoid, bool create);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry,
+ Oid subid, Oid subrelid,
+ bool create);
static void pgstat_write_statsfiles(bool permanent, bool allDbs);
static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
-static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, bool permanent);
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+ HTAB *subworkerhash, bool permanent);
static void backend_read_statsfile(void);
static bool pgstat_write_statsfile_needed(void);
@@ -335,6 +341,7 @@ static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, Timestamp
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
+static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
static bool pgstat_should_report_connstat(void);
static void pgstat_report_disconnect(Oid dboid);
@@ -373,6 +380,8 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1302,6 +1311,55 @@ pgstat_vacuum_stat(void)
hash_destroy(htab);
}
+
+ /*
+ * Repeat for subscription workers. Similarly, we needn't bother in
+ * the common case where no subscription worker stats are being
+ * collected.
+ */
+ if (dbentry->subworkers != NULL &&
+ hash_get_num_entries(dbentry->subworkers) > 0)
+ {
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ pgstat_setheader(&spmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL)
+ != NULL)
+ continue;
+
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_send_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_send_subscription_purge(&spmsg);
+
+ hash_destroy(htab);
+ }
}
@@ -1474,7 +1532,8 @@ pgstat_reset_shared_counters(const char *target)
* ----------
*/
void
-pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
+pgstat_reset_single_counter(Oid objoid, Oid subobjoid,
+ PgStat_Single_Reset_Type type)
{
PgStat_MsgResetsinglecounter msg;
@@ -1485,6 +1544,7 @@ pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
msg.m_databaseid = MyDatabaseId;
msg.m_resettype = type;
msg.m_objectid = objoid;
+ msg.m_subobjectid = subobjoid;
pgstat_send(&msg, sizeof(msg));
}
@@ -1869,6 +1929,53 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+ len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_report_subscription_drop() -
+ *
+ * Tell the collector about dropping the subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_drop(Oid subid)
+{
+ PgStat_MsgSubscriptionPurge msg;
+
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subids[0] = subid;
+ msg.m_nentries = 1;
+ pgstat_send_subscription_purge(&msg);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2874,6 +2981,33 @@ pgstat_fetch_stat_funcentry(Oid func_id)
return funcentry;
}
+/*
+ * ---------
+ * pgstat_fetch_stat_subworker_entry() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct or NULL.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_stat_subworker_entry(Oid subid, Oid subrelid)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *wentry = NULL;
+
+ /* Load the stats file if needed */
+ backend_read_statsfile();
+
+ /* Look up database, then find the requested subscription worker stats */
+ dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+ if (dbentry != NULL && dbentry->subworkers != NULL)
+ {
+ wentry = pgstat_get_subworker_entry(dbentry, subid, subrelid,
+ false);
+ }
+
+ return wentry;
+}
/*
* ---------
@@ -3312,6 +3446,23 @@ pgstat_send_slru(void)
}
}
+/* --------
+ * pgstat_send_subscription_purge() -
+ *
+ * Send a subscription purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
/* ----------
* PgstatCollectorMain() -
@@ -3568,6 +3719,14 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
default:
break;
}
@@ -3613,7 +3772,8 @@ PgstatCollectorMain(int argc, char *argv[])
/*
* Subroutine to clear stats in a database entry
*
- * Tables and functions hashes are initialized to empty.
+ * Tables, functions, and subscription workers hashes are initialized
+ * to empty.
*/
static void
reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
@@ -3666,6 +3826,13 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
PGSTAT_FUNCTION_HASH_SIZE,
&hash_ctl,
HASH_ELEM | HASH_BLOBS);
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
}
/*
@@ -3690,7 +3857,7 @@ pgstat_get_db_entry(Oid databaseid, bool create)
/*
* If not found, initialize the new one. This creates empty hash tables
- * for tables and functions, too.
+ * for tables, functions, and subscription workers, too.
*/
if (!found)
reset_dbentry_counters(result);
@@ -3748,6 +3915,45 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
return result;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return the subscription worker entry with the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, return the entry for the apply
+ * worker; otherwise, return the entry for the table sync worker associated
+ * with subrelid. If no entry exists and the create parameter is true,
+ * initialize a new one; otherwise, return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+ (void *) &key,
+ action, &found);
+
+ /* If not found, initialize the new one */
+ if (create && !found)
+ {
+ subwentry->relid = InvalidOid;
+ subwentry->command = 0;
+ subwentry->xid = InvalidTransactionId;
+ subwentry->error_count = 0;
+ subwentry->first_error_time = 0;
+ subwentry->last_error_time = 0;
+ subwentry->error_message[0] = '\0';
+ }
+
+ return subwentry;
+}
/* ----------
* pgstat_write_statsfiles() -
@@ -3947,8 +4153,10 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
{
HASH_SEQ_STATUS tstat;
HASH_SEQ_STATUS fstat;
+ HASH_SEQ_STATUS sstat;
PgStat_StatTabEntry *tabentry;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpout;
int32 format_id;
Oid dbid = dbentry->databaseid;
@@ -4003,6 +4211,17 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
(void) rc; /* we'll check for error with ferror */
}
+ /*
+ * Walk through the database's subscription worker stats table.
+ */
+ hash_seq_init(&sstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(subwentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4241,6 +4460,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
memcpy(dbentry, &dbbuf, sizeof(PgStat_StatDBEntry));
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* In the collector, disregard the timestamp we read from the
@@ -4252,8 +4472,8 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
dbentry->stats_timestamp = 0;
/*
- * Don't create tables/functions hashtables for uninteresting
- * databases.
+ * Don't create tables/functions/subworkers hashtables for
+ * uninteresting databases.
*/
if (onlydb != InvalidOid)
{
@@ -4278,6 +4498,14 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
&hash_ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
/*
* If requested, read the data from the database-specific
* file. Otherwise we just leave the hashtables empty.
@@ -4286,6 +4514,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
pgstat_read_db_statsfile(dbentry->databaseid,
dbentry->tables,
dbentry->functions,
+ dbentry->subworkers,
permanent);
break;
@@ -4363,19 +4592,21 @@ done:
* As in pgstat_read_statsfiles, if the permanent file is requested, it is
* removed after reading.
*
- * Note: this code has the ability to skip storing per-table or per-function
- * data, if NULL is passed for the corresponding hashtable. That's not used
- * at the moment though.
+ * Note: this code has the ability to skip storing per-table, per-function, or
+ * per-subscription-worker data, if NULL is passed for the corresponding hashtable.
+ * That's not used at the moment though.
* ----------
*/
static void
pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
- bool permanent)
+ HTAB *subworkerhash, bool permanent)
{
PgStat_StatTabEntry *tabentry;
PgStat_StatTabEntry tabbuf;
PgStat_StatFuncEntry funcbuf;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry subwbuf;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpin;
int32 format_id;
bool found;
@@ -4489,6 +4720,41 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
memcpy(funcentry, &funcbuf, sizeof(funcbuf));
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&subwbuf, 1, sizeof(PgStat_StatSubWorkerEntry),
+ fpin) != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /*
+ * Skip if subscription worker data not wanted.
+ */
+ if (subworkerhash == NULL)
+ break;
+
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(subworkerhash,
+ (void *) &subwbuf.key,
+ HASH_ENTER, &found);
+
+ if (found)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ memcpy(subwentry, &subwbuf, sizeof(subwbuf));
+ break;
+
/*
* 'E' The EOF marker of a complete stats file.
*/
@@ -5162,6 +5428,8 @@ pgstat_recv_dropdb(PgStat_MsgDropdb *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
if (hash_search(pgStatDBHash,
(void *) &dbid,
@@ -5199,13 +5467,16 @@ pgstat_recv_resetcounter(PgStat_MsgResetcounter *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* Reset database-level stats, too. This creates empty hash tables for
- * tables and functions.
+ * tables, functions, and subscription workers.
*/
reset_dbentry_counters(dbentry);
}
@@ -5274,6 +5545,14 @@ pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len)
else if (msg->m_resettype == RESET_FUNCTION)
(void) hash_search(dbentry->functions, (void *) &(msg->m_objectid),
HASH_REMOVE, NULL);
+ else if (msg->m_resettype == RESET_SUBWORKER)
+ {
+ PgStat_StatSubWorkerKey key;
+
+ key.subid = msg->m_objectid;
+ key.subrelid = msg->m_subobjectid;
+ (void) hash_search(dbentry->subworkers, (void *) &key, HASH_REMOVE, NULL);
+ }
}
/* ----------
@@ -5816,6 +6095,84 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS hstat;
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+
+ /* No need to purge if we don't even know the database */
+ if (!dbentry || !dbentry->subworkers)
+ return;
+
+ /* Remove all subscription worker statistics of the given subscriptions */
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (subwentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(dbentry->subworkers, (void *) &(subwentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+ /* Get the subscription worker stats */
+ subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+ msg->m_subrelid, true);
+ Assert(subwentry);
+
+ /*
+ * Update only the counter and last error timestamp if we received
+ * the same error again
+ */
+ if (subwentry->relid == msg->m_relid &&
+ subwentry->command == msg->m_command &&
+ subwentry->xid == msg->m_xid &&
+ strcmp(subwentry->error_message, msg->m_message) == 0)
+ {
+ subwentry->error_count++;
+ subwentry->last_error_time = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ subwentry->relid = msg->m_relid;
+ subwentry->command = msg->m_command;
+ subwentry->xid = msg->m_xid;
+ subwentry->error_count = 1;
+ subwentry->first_error_time = msg->m_timestamp;
+ subwentry->last_error_time = msg->m_timestamp;
+ strlcpy(subwentry->error_message, msg->m_message, PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ae1b391bda..2e79302a48 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3332,6 +3332,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3432,8 +3433,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding message
+ * type for table synchronization.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3551,7 +3574,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e64857e540..b19729d1ca 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2172,7 +2172,7 @@ pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS)
{
Oid taboid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(taboid, RESET_TABLE);
+ pgstat_reset_single_counter(taboid, InvalidOid, RESET_TABLE);
PG_RETURN_VOID();
}
@@ -2182,7 +2182,18 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
{
Oid funcoid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(funcoid, RESET_FUNCTION);
+ pgstat_reset_single_counter(funcoid, InvalidOid, RESET_FUNCTION);
+
+ PG_RETURN_VOID();
+}
+
+Datum
+pg_stat_reset_subscription_worker_subrel(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid = PG_ARGISNULL(1) ? InvalidOid : PG_GETARG_OID(1);
+
+ pgstat_reset_single_counter(subid, relid, RESET_SUBWORKER);
PG_RETURN_VOID();
}
@@ -2240,6 +2251,21 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset all subscription worker stats associated with the given subscription */
+Datum
+pg_stat_reset_subscription_worker_sub(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+
+ /*
+ * Use the subscription drop message to remove the statistics of all
+ * subscription workers.
+ */
+ pgstat_report_subscription_drop(subid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2380,3 +2406,107 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_stat_subworker_entry(subid, subrelid);
+
+ /* Return NULL if there are no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "error_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "first_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* relid */
+ if (OidIsValid(wentry->relid))
+ values[i++] = ObjectIdGetDatum(wentry->relid);
+ else
+ nulls[i++] = true;
+
+ /* command */
+ if (wentry->command != 0)
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(wentry->command));
+ else
+ nulls[i++] = true;
+
+ /* xid */
+ if (TransactionIdIsValid(wentry->xid))
+ values[i++] = TransactionIdGetDatum(wentry->xid);
+ else
+ nulls[i++] = true;
+
+ /* error_count */
+ values[i++] = Int64GetDatum(wentry->error_count);
+
+ /* error_message */
+ values[i++] = CStringGetTextDatum(wentry->error_message);
+
+ /* first_error_time */
+ if (wentry->first_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->first_error_time);
+ else
+ nulls[i++] = true;
+
+ /* last_error_time */
+ if (wentry->last_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->last_error_time);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..50e1c7b68d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription worker',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,relid,command,xid,error_count,error_message,first_error_time,last_error_time}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,16 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker_subrel' },
+{ oid => '8525',
+ descr => 'statistics: reset all collected statistics for a single subscription',
+ proname => 'pg_stat_reset_subscription_worker',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid',
+ prosrc => 'pg_stat_reset_subscription_worker_sub' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..2c26b1cbd4 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -83,6 +84,8 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
} StatMsgType;
/* ----------
@@ -145,7 +148,8 @@ typedef enum PgStat_Shared_Reset_Target
typedef enum PgStat_Single_Reset_Type
{
RESET_TABLE,
- RESET_FUNCTION
+ RESET_FUNCTION,
+ RESET_SUBWORKER
} PgStat_Single_Reset_Type;
/* ------------------------------------------------------------
@@ -364,6 +368,7 @@ typedef struct PgStat_MsgResetsinglecounter
Oid m_databaseid;
PgStat_Single_Reset_Type m_resettype;
Oid m_objectid;
+ Oid m_subobjectid;
} PgStat_MsgResetsinglecounter;
/* ----------
@@ -536,6 +541,53 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_databaseid;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
+ * report an error that occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of the error. m_subrelid is InvalidOid if the error is
+ * reported by an apply worker; otherwise it is reported by a table
+ * sync worker.
+ */
+ Oid m_databaseid;
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * Oid of the table that the reporter was actually processing. m_relid can
+ * be InvalidOid if the error occurred while the worker was applying a
+ * non-data-modification message such as RELATION.
+ */
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -714,6 +766,8 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
} PgStat_Msg;
@@ -768,11 +822,16 @@ typedef struct PgStat_StatDBEntry
TimestampTz stats_timestamp; /* time of db stats file update */
/*
- * tables and functions must be last in the struct, because we don't write
- * the pointers out to the stats file.
+ * tables, functions, and subscription workers must be last in the struct,
+ * because we don't write the pointers out to the stats file.
+ *
+ * subworker is the hash table of PgStat_StatSubWorkerEntry which stores
+ * statistics of logical replication workers: apply worker and table sync
+ * worker.
*/
HTAB *tables;
HTAB *functions;
+ HTAB *subworkers;
} PgStat_StatDBEntry;
@@ -929,6 +988,35 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of logical replication or the initial table
+ * synchronization.
+ */
+ Oid relid;
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter error_count;
+ TimestampTz first_error_time;
+ TimestampTz last_error_time;
+ char error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1019,9 +1107,11 @@ extern void pgstat_drop_database(Oid databaseid);
extern void pgstat_clear_snapshot(void);
extern void pgstat_reset_counters(void);
extern void pgstat_reset_shared_counters(const char *);
-extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
+extern void pgstat_reset_single_counter(Oid objectid, Oid subobjectid,
+ PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid, bool allstats);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
@@ -1038,6 +1128,10 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
@@ -1129,6 +1223,8 @@ extern void pgstat_send_wal(bool force);
extern PgStat_StatDBEntry *pgstat_fetch_stat_dbentry(Oid dbid);
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_stat_subworker_entry(Oid subid,
+ Oid subrelid);
extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..cb6da2c140 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,26 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.first_error_time,
+ w.last_error_time
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, relid, command, xid, error_count, error_message, first_error_time, last_error_time)
+ JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/026_error_report.pl b/src/test/subscription/t/026_error_report.pl
new file mode 100644
index 0000000000..1227654774
--- /dev/null
+++ b/src/test/subscription/t/026_error_report.pl
@@ -0,0 +1,191 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 5;
+
+# Test that the error reported in the pg_stat_subscription_workers view is as expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass];
+ $check_sql .= " AND xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, command, relid::regclass, error_count > 0
+FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscriptions. The table sync worker for test_tab2 on tap_sub will
+# enter an infinite error loop due to violating the unique constraint.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Test for resetting subscription worker statistics.
+# Truncate test_tab1 and test_tab2 so that logical replication can continue.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "TRUNCATE test_tab1, test_tab2;");
+
+# Wait for the data to be replicated.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab1");
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab2");
+
+# Reset stats of all subscription workers running on tap_sub.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+SELECT pg_stat_reset_subscription_worker(sw.subid)
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Wait for stats of all subscription workers running on tap_sub to be reset.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) = 0
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Check that the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8ed83..f41ef0d2bc 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1943,6 +1943,8 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionPurge
+PgStat_MsgSubWorkerError
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1954,6 +1956,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch that incorporates all comments I got so
far. Please review it.
Thanks for the updated patch.
A few minor comments:
doc/src/sgml/monitoring.sgml
(1) tab in doc updates
There's a tab before "Otherwise,":
+ copy of the relation with <parameter>relid</parameter>. Otherwise,
Fixed.
src/backend/utils/adt/pgstatfuncs.c
(2) The function comment for "pg_stat_reset_subscription_worker_sub"
seems a bit long and I expected it to be multi-line (did you run
pg_indent?)
I ran pg_indent on pgstatfuncs.c but it didn't become a multi-line comment.
src/include/pgstat.h
(3) Remove PgStat_StatSubWorkerEntry.dbid?
The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
seem to be used, so I think it should be removed.
(I could remove it and everything builds OK and tests pass).
Fixed.
Thank you for the comments! I've attached an updated version of the patch.
Thanks for the updated patch.
I found one issue:
This Assert can fail in a few cases:
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+                              LogicalRepMsgType command, TransactionId xid,
+                              const char *errmsg)
+{
+    PgStat_MsgSubWorkerError msg;
+    int len;
+
+    Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+    len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+
I could reproduce the problem with the following scenario:
Publisher:
create table t1 (c1 varchar);
create publication pub1 for table t1;
insert into t1 values(repeat('abcd', 5000));
Subscriber:
create table t1(c1 smallint);
create subscription sub1 connection 'dbname=postgres port=5432'
publication pub1 with ( two_phase = true);
postgres=# select * from pg_stat_subscription_workers;
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Subscriber logs:
2021-11-15 19:27:56.380 IST [15685] LOG: logical replication apply
worker for subscription "sub1" has started
2021-11-15 19:27:56.384 IST [15687] LOG: logical replication table
synchronization worker for subscription "sub1", table "t1" has started
TRAP: FailedAssertion("strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN",
File: "pgstat.c", Line: 1946, PID: 15687)
postgres: logical replication worker for subscription 16387 sync 16384
(ExceptionalCondition+0xd0)[0x55a18f3c727f]
postgres: logical replication worker for subscription 16387 sync 16384
(pgstat_report_subworker_error+0x7a)[0x55a18f126417]
postgres: logical replication worker for subscription 16387 sync 16384
(ApplyWorkerMain+0x493)[0x55a18f176611]
postgres: logical replication worker for subscription 16387 sync 16384
(StartBackgroundWorker+0x23c)[0x55a18f11f7e2]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54efc0)[0x55a18f134fc0]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54f3af)[0x55a18f1353af]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54e338)[0x55a18f134338]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7feef84371f0]
/lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7feef81e3ac7]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x5498c2)[0x55a18f12f8c2]
postgres: logical replication worker for subscription 16387 sync 16384
(PostmasterMain+0x134c)[0x55a18f12f1dd]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x43c3d4)[0x55a18f0223d4]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7feef80fd565]
postgres: logical replication worker for subscription 16387 sync 16384
(_start+0x2e)[0x55a18ecaf4fe]
2021-11-15 19:27:56.483 IST [15645] LOG: background worker "logical
replication worker" (PID 15687) was terminated by signal 6: Aborted
2021-11-15 19:27:56.483 IST [15645] LOG: terminating any other active
server processes
2021-11-15 19:27:56.485 IST [15645] LOG: all server processes
terminated; reinitializing
Here it fails because of a long error message: "invalid input syntax
for type smallint:
\"abcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabc...."
because we try to insert varchar type data into smallint type. Maybe
we should trim the error message in this case.
Regards,
Vignesh
On Mon, Nov 15, 2021 at 11:43 PM vignesh C <vignesh21@gmail.com> wrote:
Good catch!
Right. I've fixed this issue and attached an updated patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v23-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch (application/octet-stream)
From 2d70e0a4bdbff451564a611463c8dfb589d6933b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v23 1/3] Add a subscription worker statistics view
"pg_stat_subscription_workers".
This commit adds a new system view pg_stat_subscription_workers,
that shows information about any errors which occur during application
of logical replication changes as well as during performing initial table
synchronization. The subscription statistics entries are removed when
the corresponding subscription is removed.
It also adds an SQL function pg_stat_reset_subscription_worker() to
reset the error statistics of a single subscription worker.
The contents of this view can be used by an upcoming patch that skips
the particular transaction that conflicts with the existing data on
the subscriber.
This view can be extended in the future to track other xact related
statistics for subscription workers.
---
doc/src/sgml/monitoring.sgml | 167 +++++++++
src/backend/catalog/system_functions.sql | 4 +
src/backend/catalog/system_views.sql | 25 ++
src/backend/commands/subscriptioncmds.c | 16 +-
src/backend/postmaster/pgstat.c | 377 +++++++++++++++++++-
src/backend/replication/logical/worker.c | 54 ++-
src/backend/utils/adt/pgstatfuncs.c | 134 ++++++-
src/include/catalog/pg_proc.dat | 18 +
src/include/pgstat.h | 104 +++++-
src/test/regress/expected/rules.out | 20 ++
src/test/subscription/t/026_error_report.pl | 191 ++++++++++
src/tools/pgindent/typedefs.list | 4 +
12 files changed, 1093 insertions(+), 21 deletions(-)
create mode 100644 src/test/subscription/t/026_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3173ec2566..daf9fe89d5 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>One row per subscription worker, showing statistics about errors
+ that occurred on that subscription worker.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3034,6 +3043,138 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription error reported by workers applying logical
+ replication changes and workers handling the initial data copy of the
+ subscribed tables. The statistics entry is removed when the
+ corresponding subscription is removed.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error occurred. This
+ field is always NULL if the error was reported during the initial
+ data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID on the publisher node that was being applied when
+ the error occurred. This field is always NULL if the error was
+ reported during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>first_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the first error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
+ </para></entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
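For reviewers trying the patch, a quick illustrative query against the new view (column values obviously depend on which errors the workers have actually hit):

```sql
-- Inspect the latest apply/tablesync errors per subscription.
-- subrelid is NULL for the main apply worker.
SELECT subname, subrelid, command, xid, error_count,
       error_message, last_error_time
FROM pg_stat_subscription_workers
ORDER BY last_error_time DESC;
```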
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5156,6 +5297,32 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <optional> <parameter>relid</parameter> <type>oid</type> </optional> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the statistics of subscription workers running on the
+ subscription with <parameter>subid</parameter>, as shown in the
+ <structname>pg_stat_subscription_workers</structname> view. If the
+ argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets the statistics of the subscription worker handling the initial
+ data copy of the relation with <parameter>relid</parameter>; if it is
+ <literal>NULL</literal>, resets the statistics of the main apply
+ worker. If the argument <parameter>relid</parameter> is omitted,
+ resets the statistics of all subscription workers running on the
+ subscription with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
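A sketch of how the two reset variants documented above would be called (the OIDs are illustrative, not from a real cluster):

```sql
-- Reset stats of the main apply worker of subscription 16394.
SELECT pg_stat_reset_subscription_worker(16394, NULL);

-- Reset stats of the tablesync worker copying relation 16400.
SELECT pg_stat_reset_subscription_worker(16394, 16400);

-- Reset stats of all workers on the subscription.
SELECT pg_stat_reset_subscription_worker(16394);
```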
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 54c93b16c4..cd1d649f9f 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,10 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb560955cd..cb2f77cd1e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,3 +1261,28 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.first_error_time,
+ w.last_error_time
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel
+ WHERE srsubstate <> 'r') sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
+ JOIN pg_subscription s ON (w.subid = s.oid);
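Since the view expands to one call of the underlying function per (subscription, relation) pair via the LATERAL join, the function can also be called directly; for example, to fetch only the apply worker entry of a single subscription (hypothetical OID):

```sql
-- NULL subrelid selects the main apply worker's entry;
-- returns no row if that worker has no recorded error.
SELECT * FROM pg_stat_get_subscription_worker(16394, NULL);
```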
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..9427e86fee 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -32,6 +32,7 @@
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/origin.h"
#include "replication/slot.h"
@@ -1204,7 +1205,8 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Since dropping a replication slot is not transactional, the replication
* slot stays dropped even if the transaction rolls back. So we cannot
* run DROP SUBSCRIPTION inside a transaction block if dropping the
- * replication slot.
+ * replication slot. Also, in this case, we send a message to the stats
+ * collector to drop the subscription statistics.
*
* XXX The command name should really be something like "DROP SUBSCRIPTION
* of a subscription that is associated with a replication slot", but we
@@ -1377,6 +1379,18 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
PG_END_TRY();
+ /*
+ * Send a message for dropping this subscription to the stats collector.
+ * We can safely report dropping the subscription statistics here if the
+ * subscription is associated with a replication slot since we cannot run
+ * DROP SUBSCRIPTION inside a transaction block. Subscription statistics
+ * will be removed later by (auto)vacuum either if it's not associated
+ * with a replication slot or if the message for dropping the subscription
+ * gets lost.
+ */
+ if (slotname)
+ pgstat_report_subscription_drop(subid);
+
table_close(rel, NoLock);
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8c166e5e16..ee3b39a301 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +106,7 @@
#define PGSTAT_DB_HASH_SIZE 16
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
+#define PGSTAT_SUBWORKER_HASH_SIZE 256
#define PGSTAT_REPLSLOT_HASH_SIZE 32
@@ -320,10 +322,14 @@ NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_no
static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
Oid tableoid, bool create);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry,
+ Oid subid, Oid subrelid,
+ bool create);
static void pgstat_write_statsfiles(bool permanent, bool allDbs);
static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
-static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, bool permanent);
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+ HTAB *subworkerhash, bool permanent);
static void backend_read_statsfile(void);
static bool pgstat_write_statsfile_needed(void);
@@ -335,6 +341,7 @@ static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, Timestamp
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
+static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
static bool pgstat_should_report_connstat(void);
static void pgstat_report_disconnect(Oid dboid);
@@ -373,6 +380,8 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1302,6 +1311,55 @@ pgstat_vacuum_stat(void)
hash_destroy(htab);
}
+
+ /*
+ * Repeat for subscription workers. Similarly, we needn't bother
+ * in the common case where no subscription worker stats are being
+ * collected.
+ */
+ if (dbentry->subworkers != NULL &&
+ hash_get_num_entries(dbentry->subworkers) > 0)
+ {
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ pgstat_setheader(&spmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL)
+ != NULL)
+ continue;
+
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_send_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_send_subscription_purge(&spmsg);
+
+ hash_destroy(htab);
+ }
}
@@ -1474,7 +1532,8 @@ pgstat_reset_shared_counters(const char *target)
* ----------
*/
void
-pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
+pgstat_reset_single_counter(Oid objoid, Oid subobjoid,
+ PgStat_Single_Reset_Type type)
{
PgStat_MsgResetsinglecounter msg;
@@ -1485,6 +1544,7 @@ pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
msg.m_databaseid = MyDatabaseId;
msg.m_resettype = type;
msg.m_objectid = objoid;
+ msg.m_subobjectid = subobjoid;
pgstat_send(&msg, sizeof(msg));
}
@@ -1869,6 +1929,51 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ len = offsetof(PgStat_MsgSubWorkerError, m_message) + strlen(msg.m_message) + 1;
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_report_subscription_drop() -
+ *
+ * Tell the collector about dropping the subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_drop(Oid subid)
+{
+ PgStat_MsgSubscriptionPurge msg;
+
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subids[0] = subid;
+ msg.m_nentries = 1;
+ pgstat_send_subscription_purge(&msg);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2874,6 +2979,33 @@ pgstat_fetch_stat_funcentry(Oid func_id)
return funcentry;
}
+/*
+ * ---------
+ * pgstat_fetch_stat_subworker_entry() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct or NULL.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_stat_subworker_entry(Oid subid, Oid subrelid)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *wentry = NULL;
+
+ /* Load the stats file if needed */
+ backend_read_statsfile();
+
+ /* Look up database, then find the requested subscription worker stats */
+ dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+ if (dbentry != NULL && dbentry->subworkers != NULL)
+ {
+ wentry = pgstat_get_subworker_entry(dbentry, subid, subrelid,
+ false);
+ }
+
+ return wentry;
+}
/*
* ---------
@@ -3312,6 +3444,23 @@ pgstat_send_slru(void)
}
}
+/* --------
+ * pgstat_send_subscription_purge() -
+ *
+ * Send a subscription purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
/* ----------
* PgstatCollectorMain() -
@@ -3568,6 +3717,14 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
default:
break;
}
@@ -3613,7 +3770,8 @@ PgstatCollectorMain(int argc, char *argv[])
/*
* Subroutine to clear stats in a database entry
*
- * Tables and functions hashes are initialized to empty.
+ * Tables, functions, and subscription workers hashes are initialized
+ * to empty.
*/
static void
reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
@@ -3666,6 +3824,13 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
PGSTAT_FUNCTION_HASH_SIZE,
&hash_ctl,
HASH_ELEM | HASH_BLOBS);
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
}
/*
@@ -3690,7 +3855,7 @@ pgstat_get_db_entry(Oid databaseid, bool create)
/*
* If not found, initialize the new one. This creates empty hash tables
- * for tables and functions, too.
+ * for tables, functions, and subscription workers, too.
*/
if (!found)
reset_dbentry_counters(result);
@@ -3748,6 +3913,45 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
return result;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return subscription worker entry with the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, it returns an entry of the
+ * apply worker otherwise of the table sync worker associated with subrelid.
+ * If no subscription entry exists, initialize it, if the create parameter
+ * is true. Else, return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+ (void *) &key,
+ action, &found);
+
+ /* If not found, initialize the new one */
+ if (create && !found)
+ {
+ subwentry->relid = InvalidOid;
+ subwentry->command = 0;
+ subwentry->xid = InvalidTransactionId;
+ subwentry->error_count = 0;
+ subwentry->first_error_time = 0;
+ subwentry->last_error_time = 0;
+ subwentry->error_message[0] = '\0';
+ }
+
+ return subwentry;
+}
/* ----------
* pgstat_write_statsfiles() -
@@ -3947,8 +4151,10 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
{
HASH_SEQ_STATUS tstat;
HASH_SEQ_STATUS fstat;
+ HASH_SEQ_STATUS sstat;
PgStat_StatTabEntry *tabentry;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpout;
int32 format_id;
Oid dbid = dbentry->databaseid;
@@ -4003,6 +4209,17 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
(void) rc; /* we'll check for error with ferror */
}
+ /*
+ * Walk through the database's subscription worker stats table.
+ */
+ hash_seq_init(&sstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(subwentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4241,6 +4458,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
memcpy(dbentry, &dbbuf, sizeof(PgStat_StatDBEntry));
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* In the collector, disregard the timestamp we read from the
@@ -4252,8 +4470,8 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
dbentry->stats_timestamp = 0;
/*
- * Don't create tables/functions hashtables for uninteresting
- * databases.
+ * Don't create tables/functions/subworkers hashtables for
+ * uninteresting databases.
*/
if (onlydb != InvalidOid)
{
@@ -4278,6 +4496,14 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
&hash_ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
/*
* If requested, read the data from the database-specific
* file. Otherwise we just leave the hashtables empty.
@@ -4286,6 +4512,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
pgstat_read_db_statsfile(dbentry->databaseid,
dbentry->tables,
dbentry->functions,
+ dbentry->subworkers,
permanent);
break;
@@ -4363,19 +4590,21 @@ done:
* As in pgstat_read_statsfiles, if the permanent file is requested, it is
* removed after reading.
*
- * Note: this code has the ability to skip storing per-table or per-function
- * data, if NULL is passed for the corresponding hashtable. That's not used
- * at the moment though.
+ * Note: this code has the ability to skip storing per-table, per-function, or
+ * per-subscription-worker data, if NULL is passed for the corresponding hashtable.
+ * That's not used at the moment though.
* ----------
*/
static void
pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
- bool permanent)
+ HTAB *subworkerhash, bool permanent)
{
PgStat_StatTabEntry *tabentry;
PgStat_StatTabEntry tabbuf;
PgStat_StatFuncEntry funcbuf;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry subwbuf;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpin;
int32 format_id;
bool found;
@@ -4489,6 +4718,41 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
memcpy(funcentry, &funcbuf, sizeof(funcbuf));
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&subwbuf, 1, sizeof(PgStat_StatSubWorkerEntry),
+ fpin) != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /*
+ * Skip if subscription worker data not wanted.
+ */
+ if (subworkerhash == NULL)
+ break;
+
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(subworkerhash,
+ (void *) &subwbuf.key,
+ HASH_ENTER, &found);
+
+ if (found)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ memcpy(subwentry, &subwbuf, sizeof(subwbuf));
+ break;
+
/*
* 'E' The EOF marker of a complete stats file.
*/
@@ -5162,6 +5426,8 @@ pgstat_recv_dropdb(PgStat_MsgDropdb *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
if (hash_search(pgStatDBHash,
(void *) &dbid,
@@ -5199,13 +5465,16 @@ pgstat_recv_resetcounter(PgStat_MsgResetcounter *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* Reset database-level stats, too. This creates empty hash tables for
- * tables and functions.
+ * tables, functions, and subscription workers.
*/
reset_dbentry_counters(dbentry);
}
@@ -5274,6 +5543,14 @@ pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len)
else if (msg->m_resettype == RESET_FUNCTION)
(void) hash_search(dbentry->functions, (void *) &(msg->m_objectid),
HASH_REMOVE, NULL);
+ else if (msg->m_resettype == RESET_SUBWORKER)
+ {
+ PgStat_StatSubWorkerKey key;
+
+ key.subid = msg->m_objectid;
+ key.subrelid = msg->m_subobjectid;
+ (void) hash_search(dbentry->subworkers, (void *) &key, HASH_REMOVE, NULL);
+ }
}
/* ----------
@@ -5816,6 +6093,84 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS hstat;
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+
+ /* No need to purge if we don't even know the database */
+ if (!dbentry || !dbentry->subworkers)
+ return;
+
+ /* Remove all subscription worker statistics of the given subscriptions */
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (subwentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(dbentry->subworkers, (void *) &(subwentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+ /* Get the subscription worker stats */
+ subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+ msg->m_subrelid, true);
+ Assert(subwentry);
+
+ /*
+ * Update only the counter and last error timestamp if we received
+ * the same error again
+ */
+ if (subwentry->relid == msg->m_relid &&
+ subwentry->command == msg->m_command &&
+ subwentry->xid == msg->m_xid &&
+ strcmp(subwentry->error_message, msg->m_message) == 0)
+ {
+ subwentry->error_count++;
+ subwentry->last_error_time = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ subwentry->relid = msg->m_relid;
+ subwentry->command = msg->m_command;
+ subwentry->xid = msg->m_xid;
+ subwentry->error_count = 1;
+ subwentry->first_error_time = msg->m_timestamp;
+ subwentry->last_error_time = msg->m_timestamp;
+ strlcpy(subwentry->error_message, msg->m_message, PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ae1b391bda..2e79302a48 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3332,6 +3332,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3432,8 +3433,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding message
+ * type for table synchronization.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3551,7 +3574,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e64857e540..b19729d1ca 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2172,7 +2172,7 @@ pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS)
{
Oid taboid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(taboid, RESET_TABLE);
+ pgstat_reset_single_counter(taboid, InvalidOid, RESET_TABLE);
PG_RETURN_VOID();
}
@@ -2182,7 +2182,18 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
{
Oid funcoid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(funcoid, RESET_FUNCTION);
+ pgstat_reset_single_counter(funcoid, InvalidOid, RESET_FUNCTION);
+
+ PG_RETURN_VOID();
+}
+
+Datum
+pg_stat_reset_subscription_worker_subrel(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid = PG_ARGISNULL(1) ? InvalidOid : PG_GETARG_OID(1);
+
+ pgstat_reset_single_counter(subid, relid, RESET_SUBWORKER);
PG_RETURN_VOID();
}
@@ -2240,6 +2251,21 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset all subscription worker stats associated with the given subscription */
+Datum
+pg_stat_reset_subscription_worker_sub(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+
+ /*
+ * Use subscription drop message to remove statistics of all subscription
+ * workers.
+ */
+ pgstat_report_subscription_drop(subid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2380,3 +2406,107 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 9
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_stat_subworker_entry(subid, subrelid);
+
+ /* Return NULL if there is no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "error_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "first_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* relid */
+ if (OidIsValid(wentry->relid))
+ values[i++] = ObjectIdGetDatum(wentry->relid);
+ else
+ nulls[i++] = true;
+
+ /* command */
+ if (wentry->command != 0)
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(wentry->command));
+ else
+ nulls[i++] = true;
+
+ /* xid */
+ if (TransactionIdIsValid(wentry->xid))
+ values[i++] = TransactionIdGetDatum(wentry->xid);
+ else
+ nulls[i++] = true;
+
+ /* error_count */
+ values[i++] = Int64GetDatum(wentry->error_count);
+
+ /* error_message */
+ values[i++] = CStringGetTextDatum(wentry->error_message);
+
+ /* first_error_time */
+ if (wentry->first_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->first_error_time);
+ else
+ nulls[i++] = true;
+
+ /* last_error_time */
+ if (wentry->last_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->last_error_time);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d068d6532e..50e1c7b68d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5385,6 +5385,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription worker',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,relid,command,xid,error_count,error_message,first_error_time,last_error_time}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5772,6 +5780,16 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker_subrel' },
+{ oid => '8525',
+ descr => 'statistics: reset all collected statistics for a single subscription',
+ proname => 'pg_stat_reset_subscription_worker',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid',
+ prosrc => 'pg_stat_reset_subscription_worker_sub' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..2c26b1cbd4 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -83,6 +84,8 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
} StatMsgType;
/* ----------
@@ -145,7 +148,8 @@ typedef enum PgStat_Shared_Reset_Target
typedef enum PgStat_Single_Reset_Type
{
RESET_TABLE,
- RESET_FUNCTION
+ RESET_FUNCTION,
+ RESET_SUBWORKER
} PgStat_Single_Reset_Type;
/* ------------------------------------------------------------
@@ -364,6 +368,7 @@ typedef struct PgStat_MsgResetsinglecounter
Oid m_databaseid;
PgStat_Single_Reset_Type m_resettype;
Oid m_objectid;
+ Oid m_subobjectid;
} PgStat_MsgResetsinglecounter;
/* ----------
@@ -536,6 +541,53 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about the dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_databaseid;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
+ * report an error that occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of the error. m_subrelid is InvalidOid if reported by an apply
+ * worker otherwise reported by a table sync worker.
+ */
+ Oid m_databaseid;
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * Oid of the table that the reporter was actually processing. m_relid can
+ * be InvalidOid if an error occurred while the worker was applying a
+ * non-data-modification message such as RELATION.
+ */
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -714,6 +766,8 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
} PgStat_Msg;
@@ -768,11 +822,16 @@ typedef struct PgStat_StatDBEntry
TimestampTz stats_timestamp; /* time of db stats file update */
/*
- * tables and functions must be last in the struct, because we don't write
- * the pointers out to the stats file.
+ * tables, functions, and subscription workers must be last in the struct,
+ * because we don't write the pointers out to the stats file.
+ *
+ * subworker is the hash table of PgStat_StatSubWorkerEntry which stores
+ * statistics of logical replication workers: apply worker and table sync
+ * worker.
*/
HTAB *tables;
HTAB *functions;
+ HTAB *subworkers;
} PgStat_StatDBEntry;
@@ -929,6 +988,35 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of logical replication or the initial table
+ * synchronization.
+ */
+ Oid relid;
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter error_count;
+ TimestampTz first_error_time;
+ TimestampTz last_error_time;
+ char error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1019,9 +1107,11 @@ extern void pgstat_drop_database(Oid databaseid);
extern void pgstat_clear_snapshot(void);
extern void pgstat_reset_counters(void);
extern void pgstat_reset_shared_counters(const char *);
-extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
+extern void pgstat_reset_single_counter(Oid objectid, Oid subobjectid,
+ PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
+extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid, bool allstats);
extern void pgstat_report_connect(Oid dboid);
extern void pgstat_report_autovac(Oid dboid);
@@ -1038,6 +1128,10 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
@@ -1129,6 +1223,8 @@ extern void pgstat_send_wal(bool force);
extern PgStat_StatDBEntry *pgstat_fetch_stat_dbentry(Oid dbid);
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_stat_subworker_entry(Oid subid,
+ Oid subrelid);
extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..cb6da2c140 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,26 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.first_error_time,
+ w.last_error_time
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel
+ WHERE (pg_subscription_rel.srsubstate <> 'r'::"char")) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, relid, command, xid, error_count, error_message, first_error_time, last_error_time)
+ JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/026_error_report.pl b/src/test/subscription/t/026_error_report.pl
new file mode 100644
index 0000000000..1227654774
--- /dev/null
+++ b/src/test/subscription/t/026_error_report.pl
@@ -0,0 +1,191 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 5;
+
+# Test that the error reported in the pg_stat_subscription_workers view is as expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass];
+ $check_sql .= " AND xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, command, relid::regclass, error_count > 0
+FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab2 VALUES (1);
+INSERT INTO test_tab_streaming SELECT 10000, md5(10000::text);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscriptions. The table sync for test_tab2 on tap_sub will enter an
+# infinite error loop due to a unique constraint violation.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
+$node_publisher->wait_for_catchup($appname);
+$node_publisher->wait_for_catchup($appname_streaming);
+
+# Wait for initial table sync for test_tab1 and test_tab_streaming to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 2 FROM pg_subscription_rel
+WHERE srrelid in ('test_tab1'::regclass, 'test_tab_streaming'::regclass) AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data into test_tab1, raising an error on the subscriber due to a
+# violation of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Test for resetting subscription worker statistics.
+# Truncate test_tab1 and test_tab2 so that logical replication can continue.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "TRUNCATE test_tab1, test_tab2;");
+
+# Wait for the data to be replicated.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab1");
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab2");
+
+# Reset stats of all subscription workers running on tap_sub.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+SELECT pg_stat_reset_subscription_worker(sw.subid)
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Wait for stats of all subscription workers running on tap_sub to be reset.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) = 0
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Check that the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+DROP SUBSCRIPTION tap_sub_streaming;
+]);
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8ed83..f41ef0d2bc 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1943,6 +1943,8 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionPurge
+PgStat_MsgSubWorkerError
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1954,6 +1956,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Hi,
Thanks for updating the patch.
Here are a few comments.
1)
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <optional> <parameter>relid</parameter> <type>oid</type> </optional> )
It seems we should put '<optional>' before the comma (',').
2)
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>
Is the 'subrelid' only used for distinguishing the worker type? If so, would it
be clearer to have a string value here? I recall that a previous version of the
patch had a failure_source column, but it was removed. Maybe I missed something.
3)
+extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid, bool allstats);
I didn't find the code for this function; maybe we can remove this declaration?
Best regards,
Hou zj
On Wed, Nov 17, 2021 at 9:13 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
2)
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>

Is the 'subrelid' only used for distinguishing the worker type?
I think it will additionally tell us which table sync worker it is.
If so, would it
be clear to have a string value here. I recalled the previous version patch has
failure_source column but was removed. Maybe I missed something.
I also don't remember the reason for this but like to know.
I am also reviewing the latest version of the patch and will share
comments/questions sometime today.
--
With Regards,
Amit Kapila.
On Wed, Nov 17, 2021 at 1:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 17, 2021 at 9:13 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
2)
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>

Is the 'subrelid' only used for distinguishing the worker type?
I think it will additionally tell us which table sync worker it is.
Right.
If so, would it
be clearer to have a string value here? I recall that a previous version of the
patch had a failure_source column, but it was removed. Maybe I missed something.

I also don't remember the reason for this, but I would like to know.
I felt it's a bit redundant. A NULL subrelid already means that the
entry is for an apply worker (and a non-NULL subrelid identifies a
tablesync worker). If users want a value like "apply" or "tablesync"
for each entry, they can derive it from the subrelid value.
I am also reviewing the latest version of the patch and will share
comments/questions sometime today.
Thanks!
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Nov 15, 2021 at 11:43 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch that incorporates all comments I got so
far. Please review it.

Thanks for the updated patch.
A few minor comments:

doc/src/sgml/monitoring.sgml
(1) tab in doc updates
There's a tab before "Otherwise,":
+ copy of the relation with <parameter>relid</parameter>. Otherwise,

Fixed.

src/backend/utils/adt/pgstatfuncs.c
(2) The function comment for "pg_stat_reset_subscription_worker_sub"
seems a bit long and I expected it to be multi-line (did you run
pg_indent?)

I ran pg_indent on pgstatfuncs.c but it didn't become a multi-line comment.

src/include/pgstat.h
(3) Remove PgStat_StatSubWorkerEntry.dbid?
The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
seem to be used, so I think it should be removed.
(I could remove it, and everything builds OK and tests pass.)

Fixed.

Thank you for the comments! I've attached an updated version of the patch.
Thanks for the updated patch. I found one issue: this Assert can fail in a few cases:

+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+                              LogicalRepMsgType command, TransactionId xid,
+                              const char *errmsg)
+{
+	PgStat_MsgSubWorkerError msg;
+	int		len;
+
+	Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+	len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+

I could reproduce the problem with the following scenario:
Publisher:
create table t1 (c1 varchar);
create publication pub1 for table t1;
insert into t1 values(repeat('abcd', 5000));

Subscriber:
create table t1(c1 smallint);
create subscription sub1 connection 'dbname=postgres port=5432'
publication pub1 with ( two_phase = true);
postgres=# select * from pg_stat_subscription_workers;
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Subscriber logs:
2021-11-15 19:27:56.380 IST [15685] LOG: logical replication apply
worker for subscription "sub1" has started
2021-11-15 19:27:56.384 IST [15687] LOG: logical replication table
synchronization worker for subscription "sub1", table "t1" has started
TRAP: FailedAssertion("strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN",
File: "pgstat.c", Line: 1946, PID: 15687)
postgres: logical replication worker for subscription 16387 sync 16384
(ExceptionalCondition+0xd0)[0x55a18f3c727f]
postgres: logical replication worker for subscription 16387 sync 16384
(pgstat_report_subworker_error+0x7a)[0x55a18f126417]
postgres: logical replication worker for subscription 16387 sync 16384
(ApplyWorkerMain+0x493)[0x55a18f176611]
postgres: logical replication worker for subscription 16387 sync 16384
(StartBackgroundWorker+0x23c)[0x55a18f11f7e2]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54efc0)[0x55a18f134fc0]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54f3af)[0x55a18f1353af]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54e338)[0x55a18f134338]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7feef84371f0]
/lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7feef81e3ac7]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x5498c2)[0x55a18f12f8c2]
postgres: logical replication worker for subscription 16387 sync 16384
(PostmasterMain+0x134c)[0x55a18f12f1dd]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x43c3d4)[0x55a18f0223d4]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7feef80fd565]
postgres: logical replication worker for subscription 16387 sync 16384
(_start+0x2e)[0x55a18ecaf4fe]
2021-11-15 19:27:56.483 IST [15645] LOG: background worker "logical
replication worker" (PID 15687) was terminated by signal 6: Aborted
2021-11-15 19:27:56.483 IST [15645] LOG: terminating any other active
server processes
2021-11-15 19:27:56.485 IST [15645] LOG: all server processes
terminated; reinitializing

Here it fails because of a long error message "invalid input syntax
for type smallint:

Good catch!
\"abcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabc...."
because we try to insert varchar type data into smallint type. Maybe
we should trim the error message in this case.

Right. I've fixed this issue and attached an updated patch.
Thanks for the updated patch. The issue is fixed in the provided patch.
I found that in one scenario the statistics get lost:
Test steps:
Step 1:
Set up the publisher (create 100 publications, pub1...pub100, for tables t1...t100) like below:
===============================================
create table t1(c1 int);
create publication pub1 for table t1;
insert into t1 values(10);
insert into t1 values(10);
create table t2(c1 int);
create publication pub2 for table t2;
insert into t2 values(10);
insert into t2 values(10);
....
The script can be generated using:
a=0
while [ $a -lt 100 ]
do
a=`expr $a + 1`
echo "./psql -d postgres -p 5432 -c \"create table t$a(c1
int);\"" >> publisher.sh
echo "./psql -d postgres -p 5432 -c \"create publication pub$a
for table t$a;\"" >> publisher.sh
echo "./psql -d postgres -p 5432 -c \"insert into t$a
values(10);\"" >> publisher.sh
echo "./psql -d postgres -p 5432 -c \"insert into t$a
values(10);\"" >> publisher.sh
done
Step 2:
Set up the subscriber (create 100 subscriptions):
===============================================
create table t1(c1 int primary key);
create subscription sub1 connection 'dbname=postgres port=5432'
publication pub1;
create table t2(c1 int primary key);
create subscription sub2 connection 'dbname=postgres port=5432'
publication pub2;
....
The script can be generated using:
a=0
while [ $a -lt 100 ]
do
a=`expr $a + 1`
echo "./psql -d postgres -p 5433 -c \"create table t$a(c1 int
primary key);\"" >> subscriber.sh
echo "./psql -d postgres -p 5433 -c \"create subscription
sub$a connection 'dbname=postgres port=5432' publication pub$a;\"" >>
subscriber.sh
done
Step 3:
postgres=# select * from pg_stat_subscription_workers order by subid;
 subid | subname | subrelid | relid | command | xid | error_count | error_message | first_error_time | last_error_time
-------+---------+----------+-------+---------+-----+-------------+---------------+------------------+-----------------
 16389 | sub1 | 16384 | 16384 | | | 17 | duplicate key value violates unique constraint "t1_pkey" | 2021-11-17 12:01:46.141086+05:30 | 2021-11-17 12:03:13.175698+05:30
 16395 | sub2 | 16390 | 16390 | | | 16 | duplicate key value violates unique constraint "t2_pkey" | 2021-11-17 12:01:51.337055+05:30 | 2021-11-17 12:03:15.512249+05:30
 16401 | sub3 | 16396 | 16396 | | | 16 | duplicate key value violates unique constraint "t3_pkey" | 2021-11-17 12:01:51.352157+05:30 | 2021-11-17 12:03:15.802225+05:30
 16407 | sub4 | 16402 | 16402 | | | 16 | duplicate key value violates unique constraint "t4_pkey" | 2021-11-17 12:01:51.390638+05:30 | 2021-11-17 12:03:14.709496+05:30
 16413 | sub5 | 16408 | 16408 | | | 16 | duplicate key value violates unique constraint "t5_pkey" | 2021-11-17 12:01:51.418825+05:30 | 2021-11-17 12:03:15.257235+05:30
Step 4:
Then restart the publisher.
Step 5:
postgres=# select * from pg_stat_subscription_workers order by subid;
 subid | subname | subrelid | relid | command | xid | error_count | error_message | first_error_time | last_error_time
-------+---------+----------+-------+---------+-----+-------------+---------------+------------------+-----------------
 16389 | sub1 | 16384 | 16384 | | | 1 | could not create replication slot "pg_16389_sync_16384_7031422794938304519": FATAL: terminating connection due to administrator command + server closed the connection unexpectedly + This probably means the server terminated abnormally + before or while proce | 2021-11-17 12:03:28.201247+05:30 | 2021-11-17 12:03:28.201247+05:30
 16395 | sub2 | 16390 | 16390 | | | 18 | duplicate key value violates unique constraint "t2_pkey" | 2021-11-17 12:01:51.337055+05:30 | 2021-11-17 12:03:23.832585+05:30
 16401 | sub3 | 16396 | 16396 | | | 18 | duplicate key value violates unique constraint "t3_pkey" | 2021-11-17 12:01:51.352157+05:30 | 2021-11-17 12:03:26.567873+05:30
 16407 | sub4 | 16402 | 16402 | | | 1 | could not create replication slot "pg_16407_sync_16402_7031422794938304519": FATAL: terminating connection due to administrator command + server closed the connection unexpectedly + This probably means the server terminated abnormally + before or while proce | 2021-11-17 12:03:28.196958+05:30 | 2021-11-17 12:03:28.196958+05:30
 16413 | sub5 | 16408 | 16408 | | | 18 | duplicate key value violates unique constraint "t5_pkey" | 2021-11-17 12:01:51.418825+05:30 | 2021-11-17 12:03:25.595697+05:30
Step 6:
postgres=# select * from pg_stat_subscription_workers order by subid;
 subid | subname | subrelid | relid | command | xid | error_count | error_message | first_error_time | last_error_time
-------+---------+----------+-------+---------+-----+-------------+---------------+------------------+-----------------
 16389 | sub1 | 16384 | 16384 | | | 1 | duplicate key value violates unique constraint "t1_pkey" | 2021-11-17 12:03:33.346514+05:30 | 2021-11-17 12:03:33.346514+05:30
 16395 | sub2 | 16390 | 16390 | | | 19 | duplicate key value violates unique constraint "t2_pkey" | 2021-11-17 12:01:51.337055+05:30 | 2021-11-17 12:03:33.437505+05:30
 16401 | sub3 | 16396 | 16396 | | | 19 | duplicate key value violates unique constraint "t3_pkey" | 2021-11-17 12:01:51.352157+05:30 | 2021-11-17 12:03:33.482954+05:30
 16407 | sub4 | 16402 | 16402 | | | 1 | duplicate key value violates unique constraint "t4_pkey" | 2021-11-17 12:03:33.327489+05:30 | 2021-11-17 12:03:33.327489+05:30
 16413 | sub5 | 16408 | 16408 | | | 19 | duplicate key value violates unique constraint "t5_pkey" | 2021-11-17 12:01:51.418825+05:30 | 2021-11-17 12:03:33.374522+05:30
We can see that the statistics for sub1 and sub4 are lost; the old
error_count values are gone. I'm not sure whether this behavior is OK or not. Thoughts?
Regards,
Vignesh
On Wed, Nov 17, 2021 at 3:52 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Nov 15, 2021 at 11:43 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch that incorporates all comments I got so
far. Please review it.Thanks for the updated patch.
A few minor comments:doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
(1) tab in doc updates
There's a tab before "Otherwise,":
+ copy of the relation with <parameter>relid</parameter>.
Otherwise,Fixed.
src/backend/utils/adt/pgstatfuncs.c
(2) The function comment for "pg_stat_reset_subscription_worker_sub"
seems a bit long and I expected it to be multi-line (did you run
pg_indent?)I ran pg_indent on pgstatfuncs.c but it didn't become a multi-line comment.
src/include/pgstat.h
(3) Remove PgStat_StatSubWorkerEntry.dbid?
The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
seem to be used, so I think it should be removed.
(I could remove it and everything builds OK and tests pass).Fixed.
Thank you for the comments! I've attached an updated version of the patch.
Thanks for the updated patch. I found one issue: this Assert can fail in a few cases:
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+                              LogicalRepMsgType command, TransactionId xid,
+                              const char *errmsg)
+{
+    PgStat_MsgSubWorkerError msg;
+    int len;
+
+    Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+    len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+
I could reproduce the problem with the following scenario:
Publisher:
create table t1 (c1 varchar);
create publication pub1 for table t1;
insert into t1 values(repeat('abcd', 5000));
Subscriber:
create table t1(c1 smallint);
create subscription sub1 connection 'dbname=postgres port=5432'
publication pub1 with ( two_phase = true);
postgres=# select * from pg_stat_subscription_workers;
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Subscriber logs:
2021-11-15 19:27:56.380 IST [15685] LOG: logical replication apply
worker for subscription "sub1" has started
2021-11-15 19:27:56.384 IST [15687] LOG: logical replication table
synchronization worker for subscription "sub1", table "t1" has started
TRAP: FailedAssertion("strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN",
File: "pgstat.c", Line: 1946, PID: 15687)
postgres: logical replication worker for subscription 16387 sync 16384
(ExceptionalCondition+0xd0)[0x55a18f3c727f]
postgres: logical replication worker for subscription 16387 sync 16384
(pgstat_report_subworker_error+0x7a)[0x55a18f126417]
postgres: logical replication worker for subscription 16387 sync 16384
(ApplyWorkerMain+0x493)[0x55a18f176611]
postgres: logical replication worker for subscription 16387 sync 16384
(StartBackgroundWorker+0x23c)[0x55a18f11f7e2]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54efc0)[0x55a18f134fc0]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54f3af)[0x55a18f1353af]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54e338)[0x55a18f134338]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7feef84371f0]
/lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7feef81e3ac7]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x5498c2)[0x55a18f12f8c2]
postgres: logical replication worker for subscription 16387 sync 16384
(PostmasterMain+0x134c)[0x55a18f12f1dd]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x43c3d4)[0x55a18f0223d4]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7feef80fd565]
postgres: logical replication worker for subscription 16387 sync 16384
(_start+0x2e)[0x55a18ecaf4fe]
2021-11-15 19:27:56.483 IST [15645] LOG: background worker "logical
replication worker" (PID 15687) was terminated by signal 6: Aborted
2021-11-15 19:27:56.483 IST [15645] LOG: terminating any other active
server processes
2021-11-15 19:27:56.485 IST [15645] LOG: all server processes
terminated; reinitializing
Here it fails because of a long error message "invalid input syntax
for type smallint:
Good catch!
\"abcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabc...."
because we try to insert varchar type data into smallint type. Maybe
we should trim the error message in this case.
Right. I've fixed this issue and attached an updated patch.
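The trimming fix being discussed can be sketched as follows. This is a minimal illustration, not the patch's actual code; only PGSTAT_SUBWORKERERROR_MSGLEN comes from the excerpts above, and the helper name is made up:

```c
#include <string.h>
#include <assert.h>

#define PGSTAT_SUBWORKERERROR_MSGLEN 256

/*
 * Copy errmsg into the fixed-size stats-message buffer, truncating it
 * when it does not fit; the result is always NUL-terminated, so the
 * Assert on the message length can no longer fire.
 */
static void
copy_errmsg_truncated(char *dst, const char *errmsg)
{
    strncpy(dst, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN - 1);
    dst[PGSTAT_SUBWORKERERROR_MSGLEN - 1] = '\0';
}
```

With this shape, an over-long message such as the "invalid input syntax for type smallint" error above would simply be cut at 255 bytes instead of tripping the assertion.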
Thanks for the updated patch. The issue is fixed in the patch provided.
I found that in one of the scenarios the statistics is getting lost:
Thank you for the tests!!
Step 3:
postgres=# select * from pg_stat_subscription_workers order by subid;
subid | subname | subrelid | relid | command | xid | error_count |
error_message | first_error_time | last_error_time
-------+---------+----------+-------+---------+-----+-------------+------------------------------------------------------------+----------------------------------+----------------------------------
16389 | sub1 | 16384 | 16384 | | | 17 | duplicate key value violates
unique constraint "t1_pkey" | 2021-11-17 12:01:46.141086+05:30 |
2021-11-17 12:03:13.175698+05:30
16395 | sub2 | 16390 | 16390 | | | 16 | duplicate key value violates
unique constraint "t2_pkey" | 2021-11-17 12:01:51.337055+05:30 |
2021-11-17 12:03:15.512249+05:30
16401 | sub3 | 16396 | 16396 | | | 16 | duplicate key value violates
unique constraint "t3_pkey" | 2021-11-17 12:01:51.352157+05:30 |
2021-11-17 12:03:15.802225+05:30
16407 | sub4 | 16402 | 16402 | | | 16 | duplicate key value violates
unique constraint "t4_pkey" | 2021-11-17 12:01:51.390638+05:30 |
2021-11-17 12:03:14.709496+05:30
16413 | sub5 | 16408 | 16408 | | | 16 | duplicate key value violates
unique constraint "t5_pkey" | 2021-11-17 12:01:51.418825+05:30 |
2021-11-17 12:03:15.257235+05:30
Step 4:
Then restart the publisher
Step 5:
postgres=# select * from pg_stat_subscription_workers order by subid;
subid | subname | subrelid | relid | command | xid | error_count |
error_message |
first_error_time | last_error_time
-------+---------+----------+-------+---------+-----+-------------+------------------------------------------------------------------------------------------------------------------------------------------+-----
-----------------------------+----------------------------------
16389 | sub1 | 16384 | 16384 | | | 1 | could not create replication
slot "pg_16389_sync_16384_7031422794938304519": FATAL: terminating
connection due to administrator command+| 2021
-11-17 12:03:28.201247+05:30 | 2021-11-17 12:03:28.201247+05:30
| | | | | | | server closed the connection unexpectedly +|
|
| | | | | | | This probably means the server terminated abnormally +|
|
| | | | | | | before or while proce |
|
16395 | sub2 | 16390 | 16390 | | | 18 | duplicate key value violates
unique constraint "t2_pkey" | 2021
-11-17 12:01:51.337055+05:30 | 2021-11-17 12:03:23.832585+05:30
16401 | sub3 | 16396 | 16396 | | | 18 | duplicate key value violates
unique constraint "t3_pkey" | 2021
-11-17 12:01:51.352157+05:30 | 2021-11-17 12:03:26.567873+05:30
16407 | sub4 | 16402 | 16402 | | | 1 | could not create replication
slot "pg_16407_sync_16402_7031422794938304519": FATAL: terminating
connection due to administrator command+| 2021
-11-17 12:03:28.196958+05:30 | 2021-11-17 12:03:28.196958+05:30
| | | | | | | server closed the connection unexpectedly +|
|
| | | | | | | This probably means the server terminated abnormally +|
|
| | | | | | | before or while proce |
|
16413 | sub5 | 16408 | 16408 | | | 18 | duplicate key value violates
unique constraint "t5_pkey" | 2021
-11-17 12:01:51.418825+05:30 | 2021-11-17 12:03:25.595697+05:30
Step 6:
postgres=# select * from pg_stat_subscription_workers order by subid;
subid | subname | subrelid | relid | command | xid | error_count |
error_message | first_error_time | last_error_time
-------+---------+----------+-------+---------+-----+-------------+------------------------------------------------------------+----------------------------------+----------------------------------
16389 | sub1 | 16384 | 16384 | | | 1 | duplicate key value violates
unique constraint "t1_pkey" | 2021-11-17 12:03:33.346514+05:30 |
2021-11-17 12:03:33.346514+05:30
16395 | sub2 | 16390 | 16390 | | | 19 | duplicate key value violates
unique constraint "t2_pkey" | 2021-11-17 12:01:51.337055+05:30 |
2021-11-17 12:03:33.437505+05:30
16401 | sub3 | 16396 | 16396 | | | 19 | duplicate key value violates
unique constraint "t3_pkey" | 2021-11-17 12:01:51.352157+05:30 |
2021-11-17 12:03:33.482954+05:30
16407 | sub4 | 16402 | 16402 | | | 1 | duplicate key value violates
unique constraint "t4_pkey" | 2021-11-17 12:03:33.327489+05:30 |
2021-11-17 12:03:33.327489+05:30
16413 | sub5 | 16408 | 16408 | | | 19 | duplicate key value violates
unique constraint "t5_pkey" | 2021-11-17 12:01:51.418825+05:30 |
2021-11-17 12:03:33.374522+05:30
We can see that sub1 and sub4 statistics are lost, old error_count
value is lost. I'm not sure if this behavior is ok or not. Thoughts?
Looking at the outputs of steps 3, 5, and 6, the error messages are
different. In the current design, error_count is incremented only when
the exact same error (i.e., xid, command, relid, error message are the
same) comes. Since some different kinds of errors happened on the
subscription the error_count was reset. Similarly, the
first_error_time value was also reset.
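That behaviour can be modeled roughly like this. It is a simplified sketch of the collector-side update, not the patch's actual code, and the struct and field names are illustrative:

```c
#include <string.h>
#include <time.h>
#include <assert.h>

typedef struct SubWorkerErrorEntry
{
    unsigned int xid;            /* remote transaction ID, 0 if none */
    unsigned int relid;          /* relation OID, 0 if none */
    char         message[256];
    long         error_count;
    time_t       first_error_time;
    time_t       last_error_time;
} SubWorkerErrorEntry;

/* Apply one incoming error report to the stored entry. */
static void
update_error_entry(SubWorkerErrorEntry *e,
                   unsigned int xid, unsigned int relid,
                   const char *message, time_t now)
{
    if (e->error_count > 0 &&
        e->xid == xid && e->relid == relid &&
        strcmp(e->message, message) == 0)
    {
        /* Exactly the same error again: only bump counter and last time. */
        e->error_count++;
        e->last_error_time = now;
    }
    else
    {
        /* A different error: counter and first_error_time restart. */
        e->xid = xid;
        e->relid = relid;
        strncpy(e->message, message, sizeof(e->message) - 1);
        e->message[sizeof(e->message) - 1] = '\0';
        e->error_count = 1;
        e->first_error_time = now;
        e->last_error_time = now;
    }
}
```

This is why the replication-slot failures in step 5 reset the counters for sub1 and sub4: they did not match the stored duplicate-key errors.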
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Nov 15, 2021 at 11:43 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Few comments:
1) should we set subwentry to NULL to handle !create && !found case
or we could return NULL similar to the earlier function.
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid,
Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+    subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+                                                          (void *) &key,
+                                                          action, &found);
+
+ /* If not found, initialize the new one */
+ if (create && !found)
2) Should we keep the line width to 80 chars:
+/* ----------
+ * PgStat_MsgSubWorkerError    Sent by the apply worker or the table sync
+ *                             worker to report the error occurred during
+ *                             logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
Regards,
Vignesh
On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Few comments/questions:
=====================
1.
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription error reported by workers applying logical
+ replication changes and workers handling the initial data copy of the
+ subscribed tables. The statistics entry is removed when the subscription
+ the worker is running on is removed.
+ </para>
The last line of this paragraph is not clear to me. First "the" before
"worker" in the following part of the sentence seems unnecessary
"..when the subscription the worker..". Then the part "running on is
removed" is unclear because it could also mean that we remove the
entry when a subscription is disabled. Can we rephrase it to: "The
statistics entry is removed when the corresponding subscription is
dropped"?
2.
Between v20 and v23 versions of patch the size of hash table
PGSTAT_SUBWORKER_HASH_SIZE is increased from 32 to 256. I might have
missed the comment which lead to this change, can you point me to the
same or if you changed it for some other reason, can you let me know
the same?
3.
+
+ /*
+ * Repeat for subscription workers. Similarly, we needn't bother
+ * in the common case where no function stats are being collected.
+ */
/function/subscription workers'
4.
+ <para>
+ Name of command being applied when the error occurred. This field
+ is always NULL if the error was reported during the initial data
+ copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is always NULL if the error was reported
+ during the initial data copy.
+ </para></entry>
Is it important to stress on 'always' in the above two descriptions?
5.
The current description of first/last_error_time seems slightly
misleading as one can interpret that these are about different errors.
Let's slightly change the description of first/last_error_time as
follows or something on those lines:
</para>
+ <para>
+ Time at which the first error occurred
+ </para></entry>
+ </row>
First time at which this error occurred
<structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which the last error occurred
Last time at which this error occurred. This will be the same as
first_error_time except when the same error occurred more than once
consecutively.
6.
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> (
<parameter>subid</parameter> <type>oid</type>, <optional>
<parameter>relid</parameter> <type>oid</type> </optional> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the statistics of a single subscription worker running on the
+ subscription with <parameter>subid</parameter> shown in the
+ <structname>pg_stat_subscription_worker</structname> view. If the
+ argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets statistics of the subscription worker handling the initial data
+ copy of the relation with <parameter>relid</parameter>. Otherwise,
+ resets the subscription worker statistics of the main apply worker.
+ If the argument <parameter>relid</parameter> is omitted, resets the
+ statistics of all subscription workers running on the subscription
+ with <parameter>subid</parameter>.
+ </para>
The first line of this description seems to indicate that we can only
reset the stats of a single worker but the later part indicates that
we can reset stats of all subscription workers. Can we change the
first line as: "Resets the statistics of subscription workers running
on the subscription with <parameter>subid</parameter> shown in the
<structname>pg_stat_subscription_worker</structname> view.".
7.
pgstat_vacuum_stat()
{
..
+ pgstat_setheader(&spmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
..
}
Do we really need to set the header here? It seems to be getting set
in pgstat_send_subscription_purge() while sending this message.
8.
pgstat_vacuum_stat()
{
..
+
+ if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL)
+ != NULL)
+ continue;
+
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid;
..
}
I think it is better to use a separate variable here for subid as we
are using for funcid and tableid. That will make this part of the code
easier to follow and look consistent.
9.
+/* ----------
+ * PgStat_MsgSubWorkerError    Sent by the apply worker or the table sync
+ *                             worker to report the error occurred during
+ *                             logical replication.
+ * ----------
In this comment "during logical replication" sounds too generic. Can
we instead use "while processing changes." or something like that to
make it a bit more specific?
--
With Regards,
Amit Kapila.
On Wed, Nov 17, 2021 at 4:16 PM vignesh C <vignesh21@gmail.com> wrote:
Few comments:
1) should we set subwentry to NULL to handle !create && !found case
or we could return NULL similar to the earlier function.
I think it is good to be consistent with the nearby code in this case.
--
With Regards,
Amit Kapila.
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Hi,
I have few comments for the testcases.
1)
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+
I think we can remove the 'application_name=$appname', so that the command
could be shorter.
2)
+...(streaming = on, two_phase = on);");
Besides, are there some reasons to set two_phase to on? If so,
it might be better to add some comments about it.
3)
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
It seems there are no tests using the table test_tab_streaming. I guess this
table is used to test streaming change errors; maybe we can add some tests for
it?
Best regards,
Hou zj
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Thanks for your patch.
I read the discussion about stats entries for table sync worker[1], the
statistics are retained after table sync worker finished its jobs and user can remove
them via pg_stat_reset_subscription_worker function.
But I notice that, if a table sync worker finished its jobs, the error reported by
this worker will not be shown in the pg_stat_subscription_workers view
(it seems to be caused by this condition: "WHERE srsubstate <> 'r'"). Is it
intentional? I think this may cause a result where users don't know the
statistics still exist, and won't remove them manually. And that is not
friendly to users' storage, right?
[1]: /messages/by-id/CAD21AoAT42mhcqeB1jPfRL1+EUHbZk8MMY_fBgsyZvJeKNpG+w@mail.gmail.com
Regards
Tang
On Thu, Nov 18, 2021 at 5:45 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Thanks for your patch.
I read the discussion about stats entries for table sync worker[1], the
statistics are retained after table sync worker finished its jobs and user can remove
them via pg_stat_reset_subscription_worker function.
But I notice that, if a table sync worker finished its jobs, the error reported by
this worker will not be shown in the pg_stat_subscription_workers view
(it seems to be caused by this condition: "WHERE srsubstate <> 'r'"). Is it
intentional? I think this may cause a result where users don't know the
statistics still exist, and won't remove them manually. And that is not
friendly to users' storage, right?
You're right. The condition "WHERE srsubstate <> 'r'" should be removed.
I'll do that change in the next version patch. Thanks!
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Nov 17, 2021 at 12:43 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Hi,
Thanks for updating the patch.
Here are few comments.
Thank you for the comments!
1)
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <optional> <parameter>relid</parameter> <type>oid</type> </optional> )
It seems we should put '<optional>' before the comma(',').
Will fix.
2)
+ <row>
+  <entry role="catalog_table_entry"><para role="column_definition">
+   <structfield>subrelid</structfield> <type>oid</type>
+  </para>
+  <para>
+   OID of the relation that the worker is synchronizing; null for the
+   main apply worker
+  </para></entry>
+ </row>
Is the 'subrelid' only used for distinguishing the worker type? If so, would it
be clearer to have a string value here. I recall the previous version patch had a
failure_source column but it was removed. Maybe I missed something.
As Amit mentioned, users can use this to check which table sync worker it is.
3)
+extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid, bool allstats);
I didn't find the code of this function; maybe we can remove this declaration?
Will remove.
I'll submit an updated patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Nov 17, 2021 at 7:46 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Nov 15, 2021 at 11:43 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Few comments:
Thank you for the comments!
1) should we set subwentry to NULL to handle !create && !found case
or we could return NULL similar to the earlier function.
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+                           bool create)
+{
+    PgStat_StatSubWorkerEntry *subwentry;
+    PgStat_StatSubWorkerKey key;
+    bool found;
+    HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+    key.subid = subid;
+    key.subrelid = subrelid;
+    subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+                                                          (void *) &key,
+                                                          action, &found);
+
+    /* If not found, initialize the new one */
+    if (create && !found)
It's better to return NULL if !create && !found. Will fix.
2) Should we keep the line width to 80 chars:
+/* ----------
+ * PgStat_MsgSubWorkerError    Sent by the apply worker or the table sync
+ *                             worker to report the error occurred during
+ *                             logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
Hmm, pg_indent seems not to fix it. Anyway, will fix.
I'll submit an updated patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Nov 16, 2021 at 5:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
A couple of comments for the v23 patch:
doc/src/sgml/monitoring.sgml
(1) inconsistent description
I think that the following description seems inconsistent with the
previous description given above it in the patch (i.e. "One row per
subscription worker, showing statistics about errors that occurred on
that subscription worker"):
"The <structname>pg_stat_subscription_workers</structname> view will
contain one row per subscription error reported by workers applying
logical replication changes and workers handling the initial data copy
of the subscribed tables."
I think it is inconsistent because it implies there could be multiple
subscription error rows for the same worker.
Maybe the following wording could be used instead, or something similar:
"The <structname>pg_stat_subscription_workers</structname> view will
contain one row per subscription worker on which errors have occurred,
for workers applying logical replication changes and workers handling
the initial data copy of the subscribed tables."
(2) null vs NULL
The "subrelid" column description uses "null" but the "command" column
description uses "NULL".
I think "NULL" should be used for consistency.
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Nov 18, 2021 at 5:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Nov 18, 2021 at 5:45 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Thanks for your patch.
I read the discussion about stats entries for table sync worker[1], the
statistics are retained after table sync worker finished its jobs and user can remove
them via pg_stat_reset_subscription_worker function.
But I notice that, if a table sync worker finished its jobs, the error reported by
this worker will not be shown in the pg_stat_subscription_workers view
(it seems to be caused by this condition: "WHERE srsubstate <> 'r'"). Is it
intentional? I think this may cause a result where users don't know the
statistics still exist, and won't remove them manually. And that is not
friendly to users' storage, right?
You're right. The condition "WHERE srsubstate <> 'r'" should be removed.
I'll do that change in the next version patch. Thanks!
One more thing you might want to consider for the next version is
whether to rename the columns as discussed in the related thread [1]?
I think we should consider future work and name them accordingly.
[1]: /messages/by-id/CAA4eK1KR41bRUuPeNBSGv2+q7ROKukS3myeAUqrZMD8MEwR0DQ@mail.gmail.com
--
With Regards,
Amit Kapila.
On Fri, Nov 19, 2021 at 9:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Nov 18, 2021 at 5:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Nov 18, 2021 at 5:45 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Thanks for your patch.
I read the discussion about stats entries for table sync worker[1], the
statistics are retained after table sync worker finished its jobs and user can remove
them via pg_stat_reset_subscription_worker function.
But I notice that, if a table sync worker finished its jobs, the error reported by
this worker will not be shown in the pg_stat_subscription_workers view
(it seems to be caused by this condition: "WHERE srsubstate <> 'r'"). Is it
intentional? I think this may cause a result where users don't know the
statistics still exist, and won't remove them manually. And that is not
friendly to users' storage, right?
You're right. The condition "WHERE srsubstate <> 'r'" should be removed.
I'll do that change in the next version patch. Thanks!
One more thing you might want to consider for the next version is
whether to rename the columns as discussed in the related thread [1]?
I think we should consider future work and name them accordingly.
[1] - /messages/by-id/CAA4eK1KR41bRUuPeNBSGv2+q7ROKukS3myeAUqrZMD8MEwR0DQ@mail.gmail.com
Since the statistics collector process uses UDP socket, the sequencing
of the messages is not guaranteed. Will there be a problem if
Subscription is dropped and stats collector receives
PGSTAT_MTYPE_SUBSCRIPTIONPURGE first and the subscription worker entry
is removed and then receives PGSTAT_MTYPE_SUBWORKERERROR (this order
can happen because of UDP socket). I'm not sure if the Assert will be
a problem in this case. If this scenario is possible we could just
silently return in that case.
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+ /* Get the subscription worker stats */
+    subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+                                           msg->m_subrelid, true);
+ Assert(subwentry);
+
+ /*
+ * Update only the counter and last error timestamp if we received
+ * the same error again
+ */
Thoughts?
Regards,
Vignesh
On Fri, Nov 19, 2021 at 4:39 PM vignesh C <vignesh21@gmail.com> wrote:
Since the statistics collector process uses UDP socket, the sequencing
of the messages is not guaranteed. Will there be a problem if
Subscription is dropped and stats collector receives
PGSTAT_MTYPE_SUBSCRIPTIONPURGE first and the subscription worker entry
is removed and then receives PGSTAT_MTYPE_SUBWORKERERROR(this order
can happen because of UDP socket). I'm not sure if the Assert will be
a problem in this case. If this scenario is possible we could just
silently return in that case.
Given that the message sequencing is not guaranteed, it looks like
that Assert and the current code after it won't handle that scenario
well. Silently returning if subwentry is NULL does seem like the way
to deal with that possibility.
Doesn't this possibility of out-of-sequence messaging due to UDP
similarly mean that "first_error_time" and "last_error_time" may not
be currently handled correctly?
Regards,
Greg Nancarrow
Fujitsu Australia
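For what it's worth, the timestamp side of that concern could in principle be made order-insensitive by folding each report into the stored range. This is purely a sketch of the idea raised above, not something the patch does, and it ignores the separate question of when the entry as a whole resets:

```c
#include <time.h>
#include <assert.h>

/*
 * Fold one incoming error timestamp into the stored [first, last] range,
 * so that a late-delivered (out-of-order) UDP report cannot move either
 * endpoint the wrong way. Zero is used as the "unset" sentinel.
 */
static void
fold_error_time(time_t *first_error_time, time_t *last_error_time,
                time_t incoming)
{
    if (*first_error_time == 0 || incoming < *first_error_time)
        *first_error_time = incoming;
    if (incoming > *last_error_time)
        *last_error_time = incoming;
}
```

With a naive "always overwrite last_error_time" update, a report delivered late would make last_error_time go backwards; the guard above keeps both endpoints monotone regardless of delivery order.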
On Fri, Nov 19, 2021 at 11:09 AM vignesh C <vignesh21@gmail.com> wrote:
On Fri, Nov 19, 2021 at 9:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Nov 18, 2021 at 5:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Nov 18, 2021 at 5:45 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Thanks for your patch.
I read the discussion about stats entries for table sync worker[1], the
statistics are retained after table sync worker finished its jobs and user can remove
them via pg_stat_reset_subscription_worker function.
But I notice that, if a table sync worker finished its jobs, the error reported by
this worker will not be shown in the pg_stat_subscription_workers view
(it seems to be caused by this condition: "WHERE srsubstate <> 'r'"). Is it
intentional? I think this may cause a result where users don't know the
statistics still exist, and won't remove them manually. And that is not
friendly to users' storage, right?
You're right. The condition "WHERE srsubstate <> 'r'" should be removed.
I'll do that change in the next version patch. Thanks!
One more thing you might want to consider for the next version is
whether to rename the columns as discussed in the related thread [1]?
I think we should consider future work and name them accordingly.
[1] - /messages/by-id/CAA4eK1KR41bRUuPeNBSGv2+q7ROKukS3myeAUqrZMD8MEwR0DQ@mail.gmail.com
Since the statistics collector process uses UDP socket, the sequencing
of the messages is not guaranteed. Will there be a problem if
Subscription is dropped and stats collector receives
PGSTAT_MTYPE_SUBSCRIPTIONPURGE first and the subscription worker entry
is removed and then receives PGSTAT_MTYPE_SUBWORKERERROR(this order
can happen because of UDP socket). I'm not sure if the Assert will be
a problem in this case.
Why would that Assert be hit? We seem to be always passing 'create' as
true so it should create a new entry. I think a similar situation can
happen for functions and it will be probably cleaned in the next
vacuum cycle.
--
With Regards,
Amit Kapila.
On Fri, Nov 19, 2021 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 19, 2021 at 11:09 AM vignesh C <vignesh21@gmail.com> wrote:
On Fri, Nov 19, 2021 at 9:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Nov 18, 2021 at 5:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Nov 18, 2021 at 5:45 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Thanks for your patch.
I read the discussion about stats entries for table sync worker[1], the
statistics are retained after table sync worker finished its jobs and user can remove
them via pg_stat_reset_subscription_worker function.
But I notice that, if a table sync worker finished its jobs, the error reported by
this worker will not be shown in the pg_stat_subscription_workers view
(it seems to be caused by this condition: "WHERE srsubstate <> 'r'"). Is it
intentional? I think this may cause a result where users don't know the
statistics still exist, and won't remove them manually. And that is not
friendly to users' storage, right?
You're right. The condition "WHERE srsubstate <> 'r'" should be removed.
I'll do that change in the next version patch. Thanks!
One more thing you might want to consider for the next version is
whether to rename the columns as discussed in the related thread [1]?
I think we should consider future work and name them accordingly.

[1] - /messages/by-id/CAA4eK1KR41bRUuPeNBSGv2+q7ROKukS3myeAUqrZMD8MEwR0DQ@mail.gmail.com
Since the statistics collector process uses a UDP socket, the sequencing
of messages is not guaranteed. Will there be a problem if a subscription
is dropped, the stats collector receives PGSTAT_MTYPE_SUBSCRIPTIONPURGE
first and removes the subscription worker entry, and then it receives
PGSTAT_MTYPE_SUBWORKERERROR (this order can happen because of the UDP
socket)? I'm not sure if the Assert will be a problem in this case.
Why would that Assert be hit? We seem to always pass 'create' as
true, so it should create a new entry. I think a similar situation can
happen for functions, and it will probably be cleaned up in the next
vacuum cycle.
Since we are passing true, that Assert will not be hit; sorry, I missed
noticing that. It will create a new entry, as you rightly pointed out.
Since the cleanup is handled by vacuum and the current code is already
doing it that way, I felt no need to make any change.
Regards,
Vignesh
On Fri, Nov 19, 2021 at 5:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Why would that Assert be hit? We seem to always pass 'create' as
true, so it should create a new entry. I think a similar situation can
happen for functions, and it will probably be cleaned up in the next
vacuum cycle.
Oops, I missed that too. So at worst, vacuum will clean it up in the
out-of-order SUBSCRIPTIONPURGE, SUBWORKERERROR case.
But I still think the current code may not correctly handle
first_error_time/last_error_time timestamps if out-of-order
SUBWORKERERROR messages occur, right?
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Nov 19, 2021 at 1:22 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Fri, Nov 19, 2021 at 5:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Why would that Assert be hit? We seem to always pass 'create' as
true, so it should create a new entry. I think a similar situation can
happen for functions, and it will probably be cleaned up in the next
vacuum cycle.
Oops, I missed that too. So at worst, vacuum will clean it up in the
out-of-order SUBSCRIPTIONPURGE, SUBWORKERERROR case.
But I still think the current code may not correctly handle
first_error_time/last_error_time timestamps if out-of-order
SUBWORKERERROR messages occur, right?
Yeah, in such a case last_error_time can be shown as a time before
first_error_time, but I don't think that will be a big problem; the
next message will fix it. I don't see what we can do about it, and the
same is true for other cases like pg_stat_archiver, where the success
and failure times can be out of order. If we want, we can remove one of
those times, but I don't think this happens frequently enough to be
considered a problem. Anyway, these stats are not expected to always
carry the very latest info.
--
With Regards,
Amit Kapila.
On Fri, Nov 19, 2021 at 8:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Yeah, in such a case last_error_time can be shown as a time before
first_error_time, but I don't think that will be a big problem; the
next message will fix it. I don't see what we can do about it, and the
same is true for other cases like pg_stat_archiver, where the success
and failure times can be out of order. If we want, we can remove one of
those times, but I don't think this happens frequently enough to be
considered a problem. Anyway, these stats are not expected to always
carry the very latest info.
Couldn't the code block in pgstat_recv_subworker_error() that
increments error_count just compare the new msg timestamp against the
existing first_error_time and last_error_time and, based on the
result, update those if required?
Regards,
Greg Nancarrow
Fujitsu Australia
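A rough Python sketch of the guard Greg is suggesting (hypothetical names; the real code would live in pgstat_recv_subworker_error() in C): compare the incoming message timestamp against the stored first/last error times and only move each one in its own direction, so out-of-order SUBWORKERERROR messages cannot leave last_error_time earlier than first_error_time.

```python
def update_error_times(entry, msg_time):
    """Sketch of a timestamp guard for out-of-order error messages.

    'entry' is a dict with 'first_error_time' and 'last_error_time'
    (None when unset); 'msg_time' is the timestamp carried by the
    incoming SUBWORKERERROR message. This is an illustrative helper,
    not the actual pgstat_recv_subworker_error() code.
    """
    # first_error_time only ever moves backward (earlier).
    if entry["first_error_time"] is None or msg_time < entry["first_error_time"]:
        entry["first_error_time"] = msg_time
    # last_error_time only ever moves forward (later).
    if entry["last_error_time"] is None or msg_time > entry["last_error_time"]:
        entry["last_error_time"] = msg_time

# Messages arriving out of order still yield ordered timestamps.
e = {"first_error_time": None, "last_error_time": None}
update_error_times(e, 200)   # the later message arrives first
update_error_times(e, 100)   # then the earlier one
assert e["first_error_time"] == 100
assert e["last_error_time"] == 200
```

As the thread concludes, this guard may not be worth the trouble given how rarely the anomaly can occur and the upcoming shared-memory stats collector, but it shows the cost of fixing it is small.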
On Fri, Nov 19, 2021 at 3:00 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Fri, Nov 19, 2021 at 8:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Yeah, in such a case last_error_time can be shown as a time before
first_error_time, but I don't think that will be a big problem; the
next message will fix it. I don't see what we can do about it, and the
same is true for other cases like pg_stat_archiver, where the success
and failure times can be out of order. If we want, we can remove one of
those times, but I don't think this happens frequently enough to be
considered a problem. Anyway, these stats are not expected to always
carry the very latest info.
Couldn't the code block in pgstat_recv_subworker_error() that
increments error_count just compare the new msg timestamp against the
existing first_error_time and last_error_time and, based on the
result, update those if required?
I don't see any problem with that, but let's see what Sawada-San has to
say about this.
--
With Regards,
Amit Kapila.
On Fri, Nov 19, 2021 at 7:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 19, 2021 at 3:00 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Fri, Nov 19, 2021 at 8:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Yeah, in such a case last_error_time can be shown as a time before
first_error_time, but I don't think that will be a big problem; the
next message will fix it. I don't see what we can do about it, and the
same is true for other cases like pg_stat_archiver, where the success
and failure times can be out of order. If we want, we can remove one of
those times, but I don't think this happens frequently enough to be
considered a problem. Anyway, these stats are not expected to always
carry the very latest info.
Couldn't the code block in pgstat_recv_subworker_error() that
increments error_count just compare the new msg timestamp against the
existing first_error_time and last_error_time and, based on the
result, update those if required?
I don't see any problem with that, but let's see what Sawada-San has to
say about this.
IMO I'm not sure we should do that. Since the stats collector is not
likely to receive the same error report frequently in practice (5-second
interval by default), this problem is unlikely to happen in the first
place. Even if the same messages are reported frequently enough to cause
this problem, the next message will also be reported soon, fixing it
quickly, as Amit mentioned. Also, IIUC once we have the shared-memory-based
stats collector, we won’t need to worry about this problem. Given that
this kind of problem potentially exists in other stats views that have
timestamp values as well, I’m not sure it's worth dealing with it only
in the pg_stat_subscription_workers view.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Nov 18, 2021 at 12:52 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Hi,
I have a few comments on the test cases.
1)
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+

I think we can remove the 'application_name=$appname', so that the command
could be shorter.
But we wait for the subscription to catch up by using
wait_for_catchup() with application_name, no?
2)
+...(streaming = on, two_phase = on);");
Besides, is there some reason to set two_phase to on? If so,
it might be better to add some comments about it.
Yes, two_phase = on is required by the tests for the skip-transaction
patch. Will remove it.
3)
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+

It seems there are no tests that use the table test_tab_streaming. I guess this
table is used to test streaming change errors; maybe we can add some tests for
it?
Oops, similarly this is also required by the skip transaction tests.
Will remove it.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Nov 24, 2021 at 7:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Nov 18, 2021 at 12:52 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Hi,
I have a few comments on the test cases.
1)
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+

I think we can remove the 'application_name=$appname', so that the command
could be shorter.
But we wait for the subscription to catch up by using
wait_for_catchup() with application_name, no?
Yeah, but you can directly use the subscription name in
wait_for_catchup() because we internally use it as
fallback_application_name. If application_name is not specified in the
connection string, as suggested by Hou-San, then
fallback_application_name will be used instead. Both ways are okay, and
I see we use both in the tests, but it seems there are more places in
the subscription tests that use the method Hou-San is suggesting.
--
With Regards,
Amit Kapila.
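For readers unfamiliar with the fallback behaviour being discussed: libpq uses fallback_application_name only when the connection string itself does not set application_name, which is why wait_for_catchup() can be called with the bare subscription name. A tiny Python sketch of that precedence (an illustrative model, not libpq itself):

```python
def effective_application_name(connstr_params, fallback_application_name=None):
    """Model of libpq's precedence: an application_name given in the
    connection string wins; otherwise fallback_application_name is used.
    (Illustrative only; the real rules also consider environment
    variables such as PGAPPNAME.)"""
    if "application_name" in connstr_params:
        return connstr_params["application_name"]
    return fallback_application_name

# With application_name in the connection string, the fallback is ignored.
assert effective_application_name(
    {"application_name": "custom_name"},
    fallback_application_name="tap_sub") == "custom_name"
# Without it, the fallback (e.g. the subscription name) applies.
assert effective_application_name(
    {}, fallback_application_name="tap_sub") == "tap_sub"
```

So dropping 'application_name=$appname' from the CREATE SUBSCRIPTION connection string still leaves wait_for_catchup() able to find the walsender by the subscription name.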
On Wed, Nov 24, 2021 at 12:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 24, 2021 at 7:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Nov 18, 2021 at 12:52 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Hi,
I have a few comments on the test cases.
1)
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+

I think we can remove the 'application_name=$appname', so that the command
could be shorter.
But we wait for the subscription to catch up by using
wait_for_catchup() with application_name, no?
Yeah, but you can directly use the subscription name in
wait_for_catchup because we internally use that as
fallback_application_name. If application_name is not specified in the
connection string as suggested by Hou-San then
fallback_application_name will be considered. Both ways are okay and I
see we use both ways in the tests but it seems there are more places
where we use the method Hou-San is suggesting in subscription tests.
Okay, thanks! I referred to tests that set application_name. ISTM it's
better to unify them so as not to cause confusion in future tests.
Anyway, I'll remove it in the next version of the patch, which I'll submit soon.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Nov 24, 2021 at 1:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Nov 24, 2021 at 12:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 24, 2021 at 7:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Nov 18, 2021 at 12:52 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Hi,
I have a few comments on the test cases.
1)
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
+

I think we can remove the 'application_name=$appname', so that the command
could be shorter.
But we wait for the subscription to catch up by using
wait_for_catchup() with application_name, no?
Yeah, but you can directly use the subscription name in
wait_for_catchup because we internally use that as
fallback_application_name. If application_name is not specified in the
connection string as suggested by Hou-San then
fallback_application_name will be considered. Both ways are okay and I
see we use both ways in the tests but it seems there are more places
where we use the method Hou-San is suggesting in subscription tests.
Okay, thanks! I referred to tests that set application_name. ISTM it's
better to unify them so as not to cause confusion in future tests.
Agreed, but let's do this clean-up as a separate patch. Feel free to
submit the patch for the same in a separate thread.
--
With Regards,
Amit Kapila.
On Wed, Nov 17, 2021 at 8:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
Few comments/questions:
=====================
1.
+  <para>
+   The <structname>pg_stat_subscription_workers</structname> view will contain
+   one row per subscription error reported by workers applying logical
+   replication changes and workers handling the initial data copy of the
+   subscribed tables. The statistics entry is removed when the subscription
+   the worker is running on is removed.
+  </para>

The last line of this paragraph is not clear to me. First, "the" before
"worker" in the following part of the sentence seems unnecessary:
"..when the subscription the worker..". Then the part "running on is
removed" is unclear because it could also mean that we remove the
entry when a subscription is disabled. Can we rephrase it to: "The
statistics entry is removed when the corresponding subscription is
dropped"?
Agreed. Fixed.
2.
Between the v20 and v23 versions of the patch, the size of the hash table
PGSTAT_SUBWORKER_HASH_SIZE was increased from 32 to 256. I might have
missed the comment which led to this change; can you point me to it, or
if you changed it for some other reason, can you let me know why?
I'd missed reverting this change. I considered increasing this value
since the lifetime of a subscription is long, but given that an unshared
hash table can be expanded on the fly, it's better to start with a small
value. Reverted.
3.
+
+	/*
+	 * Repeat for subscription workers. Similarly, we needn't bother
+	 * in the common case where no function stats are being collected.
+	 */

/function/subscription workers'
Fixed.
4.
+      <para>
+       Name of command being applied when the error occurred. This field
+       is always NULL if the error was reported during the initial data
+       copy.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>xid</structfield> <type>xid</type>
+      </para>
+      <para>
+       Transaction ID of the publisher node being applied when the error
+       occurred. This field is always NULL if the error was reported
+       during the initial data copy.
+      </para></entry>

Is it important to stress 'always' in the above two descriptions?
No, removed.
5.
The current description of first/last_error_time seems slightly
misleading, as one can interpret that these are about different errors.
Let's slightly change the description of first/last_error_time as
follows or something along those lines:

</para>
+      <para>
+       Time at which the first error occurred
+      </para></entry>
+     </row>

First time at which this error occurred

<structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+      </para>
+      <para>
+       Time at which the last error occurred

Last time at which this error occurred. This will be the same as
first_error_time except when the same error occurred more than once
consecutively.
Changed. I've removed first_error_time as per discussion on the thread
for adding xact stats.
6.
+     </indexterm>
+     <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <optional> <parameter>relid</parameter> <type>oid</type> </optional> )
+     <returnvalue>void</returnvalue>
+    </para>
+    <para>
+     Resets the statistics of a single subscription worker running on the
+     subscription with <parameter>subid</parameter> shown in the
+     <structname>pg_stat_subscription_worker</structname> view. If the
+     argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+     resets statistics of the subscription worker handling the initial data
+     copy of the relation with <parameter>relid</parameter>. Otherwise,
+     resets the subscription worker statistics of the main apply worker.
+     If the argument <parameter>relid</parameter> is omitted, resets the
+     statistics of all subscription workers running on the subscription
+     with <parameter>subid</parameter>.
+    </para>

The first line of this description seems to indicate that we can only
reset the stats of a single worker, but the later part indicates that
we can reset the stats of all subscription workers. Can we change the
first line to: "Resets the statistics of subscription workers running
on the subscription with <parameter>subid</parameter> shown in the
<structname>pg_stat_subscription_worker</structname> view."?
Changed.
7.
pgstat_vacuum_stat()
{
..
+	pgstat_setheader(&spmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+	spmsg.m_databaseid = MyDatabaseId;
+	spmsg.m_nentries = 0;
..
}

Do we really need to set the header here? It seems to be getting set
in pgstat_send_subscription_purge() while sending this message.
Removed.
8.
pgstat_vacuum_stat()
{
..
+
+			if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL)
+				!= NULL)
+				continue;
+
+			/* This subscription is dead, add the subid to the message */
+			spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid;
..
}

I think it is better to use a separate variable here for subid, as we
are using for funcid and tableid. That will make this part of the code
easier to follow and look consistent.
Agreed, and changed.
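The purge loop being reviewed here batches dead subids into a fixed-size message and flushes whenever the message fills up, with a final flush for the remainder. A small Python sketch of that batching pattern (names and the batch size are illustrative, not the actual PostgreSQL constants):

```python
# Sketch of the dead-subscription purge batching done in pgstat_vacuum_stat().
PGSTAT_NUM_SUBSCRIPTIONPURGE = 4  # hypothetical batch capacity

def purge_dead_subscriptions(tracked_subids, live_subids, send):
    """Accumulate dead subids and hand each full batch to send()."""
    batch = []
    for subid in tracked_subids:
        if subid in live_subids:
            continue
        batch.append(subid)  # this subscription is dead
        if len(batch) >= PGSTAT_NUM_SUBSCRIPTIONPURGE:
            send(list(batch))  # message full: send it out and reinitialize
            batch.clear()
    if batch:
        send(list(batch))  # send the rest of the dead subscriptions

sent = []
purge_dead_subscriptions(range(1, 11), live_subids={2, 4}, send=sent.append)
# Eight dead subids are flushed as two batches of four.
assert sent == [[1, 3, 5, 6], [7, 8, 9, 10]]
```

The final flush outside the loop is the counterpart of the "Send the rest of dead subscriptions" block in the patch.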
9.
+/* ----------
+ * PgStat_MsgSubWorkerError	Sent by the apply worker or the table sync worker to
+ *				report the error occurred during logical replication.
+ * ----------

In this comment, "during logical replication" sounds too generic. Can
we instead use "while processing changes." or something like that to
make it a bit more specific?
"while processing changes" sounds good.
I've attached an updated version of the patch. Unless I've missed
something, all comments received so far have been incorporated. Please
review it.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v24-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch
From c699fff60613244312763c52975d64da11140ad9 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v24 1/3] Add a subscription worker statistics view
"pg_stat_subscription_workers".
This commit adds a new system view pg_stat_subscription_workers,
that shows information about any errors which occur during application
of logical replication changes as well as during performing initial table
synchronization. The subscription statistics entries are removed when
the corresponding subscription is removed.
It also adds an SQL function pg_stat_reset_subscription_worker() to
reset single subscription errors.
The contents of this view can be used by an upcoming patch that skips
the particular transaction that conflicts with the existing data on
the subscriber.
This view can be extended in the future to track other xact related
statistics for subscription workers.
---
doc/src/sgml/monitoring.sgml | 157 ++++++++
src/backend/catalog/system_functions.sql | 4 +
src/backend/catalog/system_views.sql | 23 ++
src/backend/commands/subscriptioncmds.c | 16 +-
src/backend/postmaster/pgstat.c | 380 +++++++++++++++++++-
src/backend/replication/logical/worker.c | 54 ++-
src/backend/utils/adt/pgstatfuncs.c | 126 ++++++-
src/include/catalog/pg_proc.dat | 18 +
src/include/pgstat.h | 103 +++++-
src/test/regress/expected/rules.out | 18 +
src/test/subscription/t/026_error_report.pl | 181 ++++++++++
src/tools/pgindent/typedefs.list | 4 +
12 files changed, 1063 insertions(+), 21 deletions(-)
create mode 100644 src/test/subscription/t/026_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index af6914872b..e6609a693e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>One row per subscription worker, showing statistics about errors
+ that occurred on that subscription worker.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3054,6 +3063,128 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription worker on which errors have occurred, for workers
+ applying logical replication changes and workers handling the initial data
+ copy of the subscribed tables. The statistics entry is removed when the
+ corresponding subscription is dropped.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This field
+ is null if the error was reported during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is null if the error was reported
+ during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Last time at which this error occurred.
+ </para></entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5176,6 +5307,32 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type> <optional>, <parameter>relid</parameter> <type>oid</type> </optional> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the statistics of subscription workers running on the
+ subscription with <parameter>subid</parameter> shown in the
+ <structname>pg_stat_subscription_worker</structname> view. If the
+ argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets statistics of the subscription worker handling the initial data
+ copy of the relation with <parameter>relid</parameter>. Otherwise,
+ resets the subscription worker statistics of the main apply worker.
+ If the argument <parameter>relid</parameter> is omitted, resets the
+ statistics of all subscription workers running on the subscription
+ with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index f6789025a5..3a4fa9091b 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,10 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb560955cd..5535a68d4e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,3 +1261,26 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.last_error_time
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel) sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
+ JOIN pg_subscription s ON (w.subid = s.oid);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..9427e86fee 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -32,6 +32,7 @@
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/origin.h"
#include "replication/slot.h"
@@ -1204,7 +1205,8 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Since dropping a replication slot is not transactional, the replication
* slot stays dropped even if the transaction rolls back. So we cannot
* run DROP SUBSCRIPTION inside a transaction block if dropping the
- * replication slot.
+ * replication slot. Also, in this case, we report a message for dropping
+ * the subscription to the stats collector.
*
* XXX The command name should really be something like "DROP SUBSCRIPTION
* of a subscription that is associated with a replication slot", but we
@@ -1377,6 +1379,18 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
PG_END_TRY();
+ /*
+ * Send a message for dropping this subscription to the stats collector.
+ * We can safely report dropping the subscription statistics here if the
+ * subscription is associated with a replication slot since we cannot run
+ * DROP SUBSCRIPTION inside a transaction block. Subscription statistics
+ * will be removed later by (auto)vacuum either if it's not associated
+ * with a replication slot or if the message for dropping the subscription
+ * gets lost.
+ */
+ if (slotname)
+ pgstat_report_subscription_drop(subid);
+
table_close(rel, NoLock);
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8c166e5e16..02e9ca2472 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +106,7 @@
#define PGSTAT_DB_HASH_SIZE 16
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
#define PGSTAT_REPLSLOT_HASH_SIZE 32
@@ -320,10 +322,14 @@ NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_no
static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
Oid tableoid, bool create);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry,
+ Oid subid, Oid subrelid,
+ bool create);
static void pgstat_write_statsfiles(bool permanent, bool allDbs);
static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
-static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, bool permanent);
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+ HTAB *subworkerhash, bool permanent);
static void backend_read_statsfile(void);
static bool pgstat_write_statsfile_needed(void);
@@ -335,6 +341,7 @@ static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, Timestamp
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
+static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
static bool pgstat_should_report_connstat(void);
static void pgstat_report_disconnect(Oid dboid);
@@ -373,6 +380,8 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1302,6 +1311,55 @@ pgstat_vacuum_stat(void)
hash_destroy(htab);
}
+
+ /*
+ * Repeat for subscription workers. Similarly, we needn't bother in the
+ * common case where no subscription workers' stats are being collected.
+ */
+ if (dbentry->subworkers != NULL &&
+ hash_get_num_entries(dbentry->subworkers) > 0)
+ {
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ Oid subid = subwentry->key.subid;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &subid, HASH_FIND, NULL) != NULL)
+ continue;
+
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_send_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_send_subscription_purge(&spmsg);
+
+ hash_destroy(htab);
+ }
}
@@ -1474,7 +1532,8 @@ pgstat_reset_shared_counters(const char *target)
* ----------
*/
void
-pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
+pgstat_reset_single_counter(Oid objoid, Oid subobjoid,
+ PgStat_Single_Reset_Type type)
{
PgStat_MsgResetsinglecounter msg;
@@ -1485,6 +1544,7 @@ pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
msg.m_databaseid = MyDatabaseId;
msg.m_resettype = type;
msg.m_objectid = objoid;
+ msg.m_subobjectid = subobjoid;
pgstat_send(&msg, sizeof(msg));
}
@@ -1869,6 +1929,51 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ len = offsetof(PgStat_MsgSubWorkerError, m_message) + strlen(msg.m_message) + 1;
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_report_subscription_drop() -
+ *
+ * Tell the collector about dropping the subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_drop(Oid subid)
+{
+ PgStat_MsgSubscriptionPurge msg;
+
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subids[0] = subid;
+ msg.m_nentries = 1;
+ pgstat_send_subscription_purge(&msg);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2874,6 +2979,35 @@ pgstat_fetch_stat_funcentry(Oid func_id)
return funcentry;
}
+/*
+ * ---------
+ * pgstat_fetch_stat_subworker_entry() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct or NULL.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_stat_subworker_entry(Oid subid, Oid subrelid)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *wentry = NULL;
+
+ /* Load the stats file if needed */
+ backend_read_statsfile();
+
+ /*
+ * Lookup our database, then find the requested subscription worker stats.
+ */
+ dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+ if (dbentry != NULL && dbentry->subworkers != NULL)
+ {
+ wentry = pgstat_get_subworker_entry(dbentry, subid, subrelid,
+ false);
+ }
+
+ return wentry;
+}
/*
* ---------
@@ -3312,6 +3446,23 @@ pgstat_send_slru(void)
}
}
+/* --------
+ * pgstat_send_subscription_purge() -
+ *
+ * Send a subscription purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
/* ----------
* PgstatCollectorMain() -
@@ -3568,6 +3719,14 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
default:
break;
}
@@ -3613,7 +3772,8 @@ PgstatCollectorMain(int argc, char *argv[])
/*
* Subroutine to clear stats in a database entry
*
- * Tables and functions hashes are initialized to empty.
+ * Tables, functions, and subscription workers hashes are initialized
+ * to empty.
*/
static void
reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
@@ -3666,6 +3826,13 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
PGSTAT_FUNCTION_HASH_SIZE,
&hash_ctl,
HASH_ELEM | HASH_BLOBS);
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
}
/*
@@ -3690,7 +3857,7 @@ pgstat_get_db_entry(Oid databaseid, bool create)
/*
* If not found, initialize the new one. This creates empty hash tables
- * for tables and functions, too.
+ * for tables, functions, and subscription workers, too.
*/
if (!found)
reset_dbentry_counters(result);
@@ -3748,6 +3915,47 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
return result;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return the subscription worker entry for the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, return the entry of the apply
+ * worker; otherwise return the entry of the table sync worker associated
+ * with subrelid. If no entry exists and the create parameter is true,
+ * initialize it; otherwise return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+ (void *) &key,
+ action, &found);
+
+ if (!create && !found)
+ return NULL;
+
+ /* If not found, initialize the new one */
+ if (!found)
+ {
+ subwentry->relid = InvalidOid;
+ subwentry->command = 0;
+ subwentry->xid = InvalidTransactionId;
+ subwentry->error_count = 0;
+ subwentry->last_error_time = 0;
+ subwentry->error_message[0] = '\0';
+ }
+
+ return subwentry;
+}
/* ----------
* pgstat_write_statsfiles() -
@@ -3947,8 +4155,10 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
{
HASH_SEQ_STATUS tstat;
HASH_SEQ_STATUS fstat;
+ HASH_SEQ_STATUS sstat;
PgStat_StatTabEntry *tabentry;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpout;
int32 format_id;
Oid dbid = dbentry->databaseid;
@@ -4003,6 +4213,17 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
(void) rc; /* we'll check for error with ferror */
}
+ /*
+ * Walk through the database's subscription worker stats table.
+ */
+ hash_seq_init(&sstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(subwentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4241,6 +4462,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
memcpy(dbentry, &dbbuf, sizeof(PgStat_StatDBEntry));
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* In the collector, disregard the timestamp we read from the
@@ -4252,8 +4474,8 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
dbentry->stats_timestamp = 0;
/*
- * Don't create tables/functions hashtables for uninteresting
- * databases.
+ * Don't create tables/functions/subworkers hashtables for
+ * uninteresting databases.
*/
if (onlydb != InvalidOid)
{
@@ -4278,6 +4500,14 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
&hash_ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
/*
* If requested, read the data from the database-specific
* file. Otherwise we just leave the hashtables empty.
@@ -4286,6 +4516,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
pgstat_read_db_statsfile(dbentry->databaseid,
dbentry->tables,
dbentry->functions,
+ dbentry->subworkers,
permanent);
break;
@@ -4363,19 +4594,21 @@ done:
* As in pgstat_read_statsfiles, if the permanent file is requested, it is
* removed after reading.
*
- * Note: this code has the ability to skip storing per-table or per-function
- * data, if NULL is passed for the corresponding hashtable. That's not used
- * at the moment though.
+ * Note: this code has the ability to skip storing per-table, per-function, or
+ * per-subscription-worker data, if NULL is passed for the corresponding hashtable.
+ * That's not used at the moment though.
* ----------
*/
static void
pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
- bool permanent)
+ HTAB *subworkerhash, bool permanent)
{
PgStat_StatTabEntry *tabentry;
PgStat_StatTabEntry tabbuf;
PgStat_StatFuncEntry funcbuf;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry subwbuf;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpin;
int32 format_id;
bool found;
@@ -4489,6 +4722,41 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
memcpy(funcentry, &funcbuf, sizeof(funcbuf));
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&subwbuf, 1, sizeof(PgStat_StatSubWorkerEntry),
+ fpin) != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /*
+ * Skip if subscription worker data not wanted.
+ */
+ if (subworkerhash == NULL)
+ break;
+
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(subworkerhash,
+ (void *) &subwbuf.key,
+ HASH_ENTER, &found);
+
+ if (found)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ memcpy(subwentry, &subwbuf, sizeof(subwbuf));
+ break;
+
/*
* 'E' The EOF marker of a complete stats file.
*/
@@ -5162,6 +5430,8 @@ pgstat_recv_dropdb(PgStat_MsgDropdb *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
if (hash_search(pgStatDBHash,
(void *) &dbid,
@@ -5199,13 +5469,16 @@ pgstat_recv_resetcounter(PgStat_MsgResetcounter *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* Reset database-level stats, too. This creates empty hash tables for
- * tables and functions.
+ * tables, functions, and subscription workers.
*/
reset_dbentry_counters(dbentry);
}
@@ -5274,6 +5547,14 @@ pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len)
else if (msg->m_resettype == RESET_FUNCTION)
(void) hash_search(dbentry->functions, (void *) &(msg->m_objectid),
HASH_REMOVE, NULL);
+ else if (msg->m_resettype == RESET_SUBWORKER)
+ {
+ PgStat_StatSubWorkerKey key;
+
+ key.subid = msg->m_objectid;
+ key.subrelid = msg->m_subobjectid;
+ (void) hash_search(dbentry->subworkers, (void *) &key, HASH_REMOVE, NULL);
+ }
}
/* ----------
@@ -5816,6 +6097,83 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS hstat;
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+
+ /* No need to purge if we don't even know the database */
+ if (!dbentry || !dbentry->subworkers)
+ return;
+
+ /* Remove all subscription worker statistics of the given subscriptions */
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (subwentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(dbentry->subworkers, (void *) &(subwentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+ /* Get the subscription worker stats */
+ subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+ msg->m_subrelid, true);
+ Assert(subwentry);
+
+ /*
+ * Update only the counter and last error timestamp if we received the
+ * same error again
+ */
+ if (subwentry->relid == msg->m_relid &&
+ subwentry->command == msg->m_command &&
+ subwentry->xid == msg->m_xid &&
+ strcmp(subwentry->error_message, msg->m_message) == 0)
+ {
+ subwentry->error_count++;
+ subwentry->last_error_time = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ subwentry->relid = msg->m_relid;
+ subwentry->command = msg->m_command;
+ subwentry->xid = msg->m_xid;
+ subwentry->error_count = 1;
+ subwentry->last_error_time = msg->m_timestamp;
+ strlcpy(subwentry->error_message, msg->m_message, PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ae1b391bda..2e79302a48 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3332,6 +3332,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3432,8 +3433,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding message
+ * type for table synchronization.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3551,7 +3574,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e64857e540..859900f60f 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2172,7 +2172,7 @@ pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS)
{
Oid taboid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(taboid, RESET_TABLE);
+ pgstat_reset_single_counter(taboid, InvalidOid, RESET_TABLE);
PG_RETURN_VOID();
}
@@ -2182,7 +2182,18 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
{
Oid funcoid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(funcoid, RESET_FUNCTION);
+ pgstat_reset_single_counter(funcoid, InvalidOid, RESET_FUNCTION);
+
+ PG_RETURN_VOID();
+}
+
+Datum
+pg_stat_reset_subscription_worker_subrel(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid = PG_ARGISNULL(1) ? InvalidOid : PG_GETARG_OID(1);
+
+ pgstat_reset_single_counter(subid, relid, RESET_SUBWORKER);
PG_RETURN_VOID();
}
@@ -2240,6 +2251,21 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset all subscription worker stats associated with the given subscription */
+Datum
+pg_stat_reset_subscription_worker_sub(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+
+ /*
+ * Use subscription drop message to remove statistics of all subscription
+ * workers.
+ */
+ pgstat_report_subscription_drop(subid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2380,3 +2406,99 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 8
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_stat_subworker_entry(subid, subrelid);
+
+ /* Return NULL if there are no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "error_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* relid */
+ if (OidIsValid(wentry->relid))
+ values[i++] = ObjectIdGetDatum(wentry->relid);
+ else
+ nulls[i++] = true;
+
+ /* command */
+ if (wentry->command != 0)
+ values[i++] = CStringGetTextDatum(logicalrep_message_type(wentry->command));
+ else
+ nulls[i++] = true;
+
+ /* xid */
+ if (TransactionIdIsValid(wentry->xid))
+ values[i++] = TransactionIdGetDatum(wentry->xid);
+ else
+ nulls[i++] = true;
+
+ /* error_count */
+ values[i++] = Int64GetDatum(wentry->error_count);
+
+ /* error_message */
+ values[i++] = CStringGetTextDatum(wentry->error_message);
+
+ /* last_error_time */
+ if (wentry->last_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->last_error_time);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index e934361dc3..6fc70fb4fb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5389,6 +5389,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription worker',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,relid,command,xid,error_count,error_message,last_error_time}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5776,6 +5784,16 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker_subrel' },
+{ oid => '8525',
+ descr => 'statistics: reset all collected statistics for a single subscription',
+ proname => 'pg_stat_reset_subscription_worker',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid',
+ prosrc => 'pg_stat_reset_subscription_worker_sub' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..ce8fb2d98f 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -83,6 +84,8 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
} StatMsgType;
/* ----------
@@ -145,7 +148,8 @@ typedef enum PgStat_Shared_Reset_Target
typedef enum PgStat_Single_Reset_Type
{
RESET_TABLE,
- RESET_FUNCTION
+ RESET_FUNCTION,
+ RESET_SUBWORKER
} PgStat_Single_Reset_Type;
/* ------------------------------------------------------------
@@ -364,6 +368,7 @@ typedef struct PgStat_MsgResetsinglecounter
Oid m_databaseid;
PgStat_Single_Reset_Type m_resettype;
Oid m_objectid;
+ Oid m_subobjectid;
} PgStat_MsgResetsinglecounter;
/* ----------
@@ -536,6 +541,54 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_databaseid;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync
+ * worker to report an error that occurred
+ * while processing changes.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid identify the subscription and the reporter of
+ * the error: m_subrelid is InvalidOid if the error is reported by an
+ * apply worker, and the relation OID if reported by a table sync worker.
+ */
+ Oid m_databaseid;
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * OID of the table that the reporter was actually processing. m_relid
+ * can be InvalidOid if the error occurred while the worker was applying
+ * a non-data-modification message such as RELATION.
+ */
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -714,6 +767,8 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
} PgStat_Msg;
@@ -768,11 +823,16 @@ typedef struct PgStat_StatDBEntry
TimestampTz stats_timestamp; /* time of db stats file update */
/*
- * tables and functions must be last in the struct, because we don't write
- * the pointers out to the stats file.
+ * tables, functions, and subscription workers must be last in the struct,
+ * because we don't write the pointers out to the stats file.
+ *
+ * subworkers is the hash table of PgStat_StatSubWorkerEntry which stores
+ * statistics of logical replication workers: the apply worker and table
+ * sync workers.
*/
HTAB *tables;
HTAB *functions;
+ HTAB *subworkers;
} PgStat_StatDBEntry;
@@ -929,6 +989,34 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for the apply worker; the
+ * relation OID for a table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of logical replication or the initial table
+ * synchronization.
+ */
+ Oid relid;
+ LogicalRepMsgType command;
+ TransactionId xid;
+ PgStat_Counter error_count;
+ TimestampTz last_error_time;
+ char error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1019,7 +1107,8 @@ extern void pgstat_drop_database(Oid databaseid);
extern void pgstat_clear_snapshot(void);
extern void pgstat_reset_counters(void);
extern void pgstat_reset_shared_counters(const char *);
-extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
+extern void pgstat_reset_single_counter(Oid objectid, Oid subobjectid,
+ PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
@@ -1038,6 +1127,10 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
@@ -1129,6 +1222,8 @@ extern void pgstat_send_wal(bool force);
extern PgStat_StatDBEntry *pgstat_fetch_stat_dbentry(Oid dbid);
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_stat_subworker_entry(Oid subid,
+ Oid subrelid);
extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..132ea53a9e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,24 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.last_error_time
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, relid, command, xid, error_count, error_message, last_error_time)
+ JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/026_error_report.pl b/src/test/subscription/t/026_error_report.pl
new file mode 100644
index 0000000000..e07bf31a42
--- /dev/null
+++ b/src/test/subscription/t/026_error_report.pl
@@ -0,0 +1,181 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 5;
+
+# Test if the error reported on pg_stat_subscription_workers view is expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass];
+ $check_sql .= " AND xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, command, relid::regclass, error_count > 0
+FROM pg_stat_subscription_workers
+WHERE relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+]);
+
+# Check if there is no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscription. The table sync for test_tab2 on tap_sub will enter into
+# infinite error loop due to violating the unique constraint.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (streaming = off);");
+
+$node_publisher->wait_for_catchup('tap_sub');
+
+# Wait for initial table sync for test_tab1 to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 1 FROM pg_subscription_rel
+WHERE srrelid = 'test_tab1'::regclass AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Test for resetting subscription worker statistics.
+# Truncate test_tab1 and test_tab2 so that applying changes and table sync can
+# continue, respectively.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "TRUNCATE test_tab1, test_tab2;");
+
+# Wait for the data to be replicated.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab1");
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab2");
+
+# Reset stats of all subscription workers running on tap_sub.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+SELECT pg_stat_reset_subscription_worker(sw.subid)
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Wait for stats of all subscription workers running on tap_sub to be reset.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) = 0
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Check if the view doesn't show any entries after dropping the subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+]);
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8ed83..f41ef0d2bc 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1943,6 +1943,8 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionPurge
+PgStat_MsgSubWorkerError
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1954,6 +1956,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Changed. I've removed first_error_time as per discussion on the thread
for adding xact stats.
We also agreed to change the column names to start with last_error_*
[1]. Is there a reason to not make those changes? Do you think that we
can change it just before committing that patch? I thought it might be
better to do it that way now itself.
[1]: /messages/by-id/CAD21AoCQ8z5goy3BCqfk2gn5p8NVH5B-uxO3Xc-dXN-MXVfnKg@mail.gmail.com
--
With Regards,
Amit Kapila.
On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Nov 17, 2021 at 8:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
One very minor comment:
"conflict" can be moved to the next line to keep it within the 80-char
boundary wherever possible:
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that
will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
Similarly in the below:
+# Insert more data to test_tab1, raising an error on the subscriber
due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
The rest of the patch looks good.
Regards,
Vignesh
On Wed, Nov 24, 2021 at 10:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch. Unless I miss something, all
comments I got so far have been incorporated into this patch. Please
review it.
Only a couple of minor points:
src/backend/postmaster/pgstat.c
(1) pgstat_get_subworker_entry
In the following comment, it should say "returns an entry ...":
+ * apply worker otherwise returns entry of the table sync worker associated
src/include/pgstat.h
(2) typedef struct PgStat_StatDBEntry
"subworker" should be "subworkers" in the following comment, to match
the struct member name:
* subworker is the hash table of PgStat_StatSubWorkerEntry which stores
Otherwise, the patch LGTM.
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Nov 25, 2021 at 1:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Changed. I've removed first_error_time as per discussion on the thread
for adding xact stats.
We also agreed to change the column names to start with last_error_*
[1]. Is there a reason to not make those changes? Do you think that we
can change it just before committing that patch? I thought it might be
better to do it that way now itself.
Oh, I thought that you think that we change the column names when
adding xact stats to the view. But these names also make sense even
without the xact stats. I've attached an updated patch. It also
incorporated comments from Vignesh and Greg.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
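Based on the v25 patch attached below, a session on the subscriber could inspect the new view to find the remote XID of the failing transaction and later reset the error statistics. This is only a sketch of the intended usage; 'tap_sub' is the subscription name from the TAP test, and the column names follow the v25 patch in this thread:

```sql
-- Identify the failing transaction for a given subscription.
-- last_error_xid is the remote XID that could then be skipped.
SELECT subname, subrelid, last_error_relid::regclass AS rel,
       last_error_command, last_error_xid, last_error_count,
       last_error_message, last_error_time
FROM pg_stat_subscription_workers
WHERE subname = 'tap_sub';

-- After resolving the conflict, reset the stored error statistics.
-- Omitting the relid argument resets all workers of the subscription.
SELECT pg_stat_reset_subscription_worker(oid)
FROM pg_subscription
WHERE subname = 'tap_sub';
```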
Attachments:
v25-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch (application/octet-stream)
From a9670d42539de68de56f49b57924552f5cd397d5 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v25 1/3] Add a subscription worker statistics view
"pg_stat_subscription_workers".
This commit adds a new system view pg_stat_subscription_workers,
that shows information about any errors which occur during application
of logical replication changes as well as during performing initial table
synchronization. The subscription statistics entries are removed when
the corresponding subscription is removed.
It also adds an SQL function pg_stat_reset_subscription_worker() to
reset single subscription errors.
The contents of this view can be used by an upcoming patch that skips
the particular transaction that conflicts with the existing data on
the subscriber.
This view can be extended in the future to track other xact related
statistics for subscription workers.
---
doc/src/sgml/monitoring.sgml | 157 ++++++++
src/backend/catalog/system_functions.sql | 4 +
src/backend/catalog/system_views.sql | 23 ++
src/backend/commands/subscriptioncmds.c | 16 +-
src/backend/postmaster/pgstat.c | 381 +++++++++++++++++++-
src/backend/replication/logical/worker.c | 54 ++-
src/backend/utils/adt/pgstatfuncs.c | 127 ++++++-
src/include/catalog/pg_proc.dat | 18 +
src/include/pgstat.h | 103 +++++-
src/test/regress/expected/rules.out | 18 +
src/test/subscription/t/026_error_report.pl | 181 ++++++++++
src/tools/pgindent/typedefs.list | 4 +
12 files changed, 1065 insertions(+), 21 deletions(-)
create mode 100644 src/test/subscription/t/026_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index af6914872b..d86a8158ba 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>One row per subscription worker, showing statistics about errors
+ that occurred on that subscription worker.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3054,6 +3063,128 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription worker on which errors have occurred, for workers
+ applying logical replication changes and workers handling the initial data
+ copy of the subscribed tables. The statistics entry is removed when the
+ corresponding subscription is dropped.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This field
+ is null if the error was reported during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is null if the error was reported
+ during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Last time at which this error occurred.
+ </para></entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5176,6 +5307,32 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type> <optional>, <parameter>relid</parameter> <type>oid</type> </optional> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the statistics of subscription workers running on the
+ subscription with <parameter>subid</parameter> shown in the
+ <structname>pg_stat_subscription_worker</structname> view. If the
+ argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets statistics of the subscription worker handling the initial data
+ copy of the relation with <parameter>relid</parameter>. Otherwise,
+ resets the subscription worker statistics of the main apply worker.
+ If the argument <parameter>relid</parameter> is omitted, resets the
+ statistics of all subscription workers running on the subscription
+ with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index f6789025a5..3a4fa9091b 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,10 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb560955cd..61b515cdb8 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,3 +1261,26 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.last_error_relid,
+ w.last_error_command,
+ w.last_error_xid,
+ w.last_error_count,
+ w.last_error_message,
+ w.last_error_time
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel) sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
+ JOIN pg_subscription s ON (w.subid = s.oid);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..9427e86fee 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -32,6 +32,7 @@
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/origin.h"
#include "replication/slot.h"
@@ -1204,7 +1205,8 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Since dropping a replication slot is not transactional, the replication
* slot stays dropped even if the transaction rolls back. So we cannot
* run DROP SUBSCRIPTION inside a transaction block if dropping the
- * replication slot.
+ * replication slot. Also, in this case, we report a message for dropping
+ * the subscription to the stats collector.
*
* XXX The command name should really be something like "DROP SUBSCRIPTION
* of a subscription that is associated with a replication slot", but we
@@ -1377,6 +1379,18 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
PG_END_TRY();
+ /*
+ * Send a message for dropping this subscription to the stats collector.
+ * We can safely report dropping the subscription statistics here if the
+ * subscription is associated with a replication slot since we cannot run
+ * DROP SUBSCRIPTION inside a transaction block. Subscription statistics
+ * will be removed later by (auto)vacuum either if it's not associated
+ * with a replication slot or if the message for dropping the subscription
+ * gets lost.
+ */
+ if (slotname)
+ pgstat_report_subscription_drop(subid);
+
table_close(rel, NoLock);
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8c166e5e16..7666508dc4 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +106,7 @@
#define PGSTAT_DB_HASH_SIZE 16
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
#define PGSTAT_REPLSLOT_HASH_SIZE 32
@@ -320,10 +322,14 @@ NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_no
static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
Oid tableoid, bool create);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry,
+ Oid subid, Oid subrelid,
+ bool create);
static void pgstat_write_statsfiles(bool permanent, bool allDbs);
static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
-static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, bool permanent);
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+ HTAB *subworkerhash, bool permanent);
static void backend_read_statsfile(void);
static bool pgstat_write_statsfile_needed(void);
@@ -335,6 +341,7 @@ static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, Timestamp
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
+static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
static bool pgstat_should_report_connstat(void);
static void pgstat_report_disconnect(Oid dboid);
@@ -373,6 +380,8 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1302,6 +1311,55 @@ pgstat_vacuum_stat(void)
hash_destroy(htab);
}
+
+ /*
+ * Repeat for subscription workers. Similarly, we needn't bother in the
+ * common case where no subscription workers' stats are being collected.
+ */
+ if (dbentry->subworkers != NULL &&
+ hash_get_num_entries(dbentry->subworkers) > 0)
+ {
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ Oid subid = subwentry->key.subid;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &subid, HASH_FIND, NULL) != NULL)
+ continue;
+
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_send_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_send_subscription_purge(&spmsg);
+
+ hash_destroy(htab);
+ }
}
@@ -1474,7 +1532,8 @@ pgstat_reset_shared_counters(const char *target)
* ----------
*/
void
-pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
+pgstat_reset_single_counter(Oid objoid, Oid subobjoid,
+ PgStat_Single_Reset_Type type)
{
PgStat_MsgResetsinglecounter msg;
@@ -1485,6 +1544,7 @@ pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
msg.m_databaseid = MyDatabaseId;
msg.m_resettype = type;
msg.m_objectid = objoid;
+ msg.m_subobjectid = subobjoid;
pgstat_send(&msg, sizeof(msg));
}
@@ -1869,6 +1929,51 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ len = offsetof(PgStat_MsgSubWorkerError, m_message) + strlen(msg.m_message) + 1;
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_report_subscription_drop() -
+ *
+ * Tell the collector about dropping the subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_drop(Oid subid)
+{
+ PgStat_MsgSubscriptionPurge msg;
+
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subids[0] = subid;
+ msg.m_nentries = 1;
+ pgstat_send_subscription_purge(&msg);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2874,6 +2979,35 @@ pgstat_fetch_stat_funcentry(Oid func_id)
return funcentry;
}
+/*
+ * ---------
+ * pgstat_fetch_stat_subworker_entry() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct or NULL.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_stat_subworker_entry(Oid subid, Oid subrelid)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *wentry = NULL;
+
+ /* Load the stats file if needed */
+ backend_read_statsfile();
+
+ /*
+ * Lookup our database, then find the requested subscription worker stats.
+ */
+ dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+ if (dbentry != NULL && dbentry->subworkers != NULL)
+ {
+ wentry = pgstat_get_subworker_entry(dbentry, subid, subrelid,
+ false);
+ }
+
+ return wentry;
+}
/*
* ---------
@@ -3312,6 +3446,23 @@ pgstat_send_slru(void)
}
}
+/* --------
+ * pgstat_send_subscription_purge() -
+ *
+ * Send a subscription purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
/* ----------
* PgstatCollectorMain() -
@@ -3568,6 +3719,14 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
default:
break;
}
@@ -3613,7 +3772,8 @@ PgstatCollectorMain(int argc, char *argv[])
/*
* Subroutine to clear stats in a database entry
*
- * Tables and functions hashes are initialized to empty.
+ * Tables, functions, and subscription workers hashes are initialized
+ * to empty.
*/
static void
reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
@@ -3666,6 +3826,13 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
PGSTAT_FUNCTION_HASH_SIZE,
&hash_ctl,
HASH_ELEM | HASH_BLOBS);
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
}
/*
@@ -3690,7 +3857,7 @@ pgstat_get_db_entry(Oid databaseid, bool create)
/*
* If not found, initialize the new one. This creates empty hash tables
- * for tables and functions, too.
+ * for tables, functions, and subscription worker, too.
*/
if (!found)
reset_dbentry_counters(result);
@@ -3748,6 +3915,47 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
return result;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return subscription worker entry with the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, it returns an entry of the
+ * apply worker otherwise returns an entry of the table sync worker
+ * associated with subrelid. If no subscription entry exists,
+ * initialize it, if the create parameter is true. Else, return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+ (void *) &key,
+ action, &found);
+
+ if (!create && !found)
+ return NULL;
+
+ /* If not found, initialize the new one */
+ if (!found)
+ {
+ subwentry->last_error_relid = InvalidOid;
+ subwentry->last_error_command = 0;
+ subwentry->last_error_xid = InvalidTransactionId;
+ subwentry->last_error_count = 0;
+ subwentry->last_error_time = 0;
+ subwentry->last_error_message[0] = '\0';
+ }
+
+ return subwentry;
+}
/* ----------
* pgstat_write_statsfiles() -
@@ -3947,8 +4155,10 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
{
HASH_SEQ_STATUS tstat;
HASH_SEQ_STATUS fstat;
+ HASH_SEQ_STATUS sstat;
PgStat_StatTabEntry *tabentry;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpout;
int32 format_id;
Oid dbid = dbentry->databaseid;
@@ -4003,6 +4213,17 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
(void) rc; /* we'll check for error with ferror */
}
+ /*
+ * Walk through the database's subscription worker stats table.
+ */
+ hash_seq_init(&sstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(subwentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4241,6 +4462,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
memcpy(dbentry, &dbbuf, sizeof(PgStat_StatDBEntry));
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* In the collector, disregard the timestamp we read from the
@@ -4252,8 +4474,8 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
dbentry->stats_timestamp = 0;
/*
- * Don't create tables/functions hashtables for uninteresting
- * databases.
+ * Don't create tables/functions/subworkers hashtables for
+ * uninteresting databases.
*/
if (onlydb != InvalidOid)
{
@@ -4278,6 +4500,14 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
&hash_ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
/*
* If requested, read the data from the database-specific
* file. Otherwise we just leave the hashtables empty.
@@ -4286,6 +4516,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
pgstat_read_db_statsfile(dbentry->databaseid,
dbentry->tables,
dbentry->functions,
+ dbentry->subworkers,
permanent);
break;
@@ -4363,19 +4594,21 @@ done:
* As in pgstat_read_statsfiles, if the permanent file is requested, it is
* removed after reading.
*
- * Note: this code has the ability to skip storing per-table or per-function
- * data, if NULL is passed for the corresponding hashtable. That's not used
- * at the moment though.
+ * Note: this code has the ability to skip storing per-table, per-function, or
+ * per-subscription-worker data, if NULL is passed for the corresponding hashtable.
+ * That's not used at the moment though.
* ----------
*/
static void
pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
- bool permanent)
+ HTAB *subworkerhash, bool permanent)
{
PgStat_StatTabEntry *tabentry;
PgStat_StatTabEntry tabbuf;
PgStat_StatFuncEntry funcbuf;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry subwbuf;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpin;
int32 format_id;
bool found;
@@ -4489,6 +4722,41 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
memcpy(funcentry, &funcbuf, sizeof(funcbuf));
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&subwbuf, 1, sizeof(PgStat_StatSubWorkerEntry),
+ fpin) != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /*
+ * Skip if subscription worker data not wanted.
+ */
+ if (subworkerhash == NULL)
+ break;
+
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(subworkerhash,
+ (void *) &subwbuf.key,
+ HASH_ENTER, &found);
+
+ if (found)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ memcpy(subwentry, &subwbuf, sizeof(subwbuf));
+ break;
+
/*
* 'E' The EOF marker of a complete stats file.
*/
@@ -5162,6 +5430,8 @@ pgstat_recv_dropdb(PgStat_MsgDropdb *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
if (hash_search(pgStatDBHash,
(void *) &dbid,
@@ -5199,13 +5469,16 @@ pgstat_recv_resetcounter(PgStat_MsgResetcounter *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* Reset database-level stats, too. This creates empty hash tables for
- * tables and functions.
+ * tables, functions, and subscription workers.
*/
reset_dbentry_counters(dbentry);
}
@@ -5274,6 +5547,14 @@ pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len)
else if (msg->m_resettype == RESET_FUNCTION)
(void) hash_search(dbentry->functions, (void *) &(msg->m_objectid),
HASH_REMOVE, NULL);
+ else if (msg->m_resettype == RESET_SUBWORKER)
+ {
+ PgStat_StatSubWorkerKey key;
+
+ key.subid = msg->m_objectid;
+ key.subrelid = msg->m_subobjectid;
+ (void) hash_search(dbentry->subworkers, (void *) &key, HASH_REMOVE, NULL);
+ }
}
/* ----------
@@ -5816,6 +6097,84 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS hstat;
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+
+ /* No need to purge if we don't even know the database */
+ if (!dbentry || !dbentry->subworkers)
+ return;
+
+ /* Remove all subscription worker statistics of the given subscriptions */
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (subwentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(dbentry->subworkers, (void *) &(subwentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+ /* Get the subscription worker stats */
+ subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+ msg->m_subrelid, true);
+ Assert(subwentry);
+
+ /*
+ * Update only the counter and last error timestamp if we received the
+ * same error again
+ */
+ if (subwentry->last_error_relid == msg->m_relid &&
+ subwentry->last_error_command == msg->m_command &&
+ subwentry->last_error_xid == msg->m_xid &&
+ strcmp(subwentry->last_error_message, msg->m_message) == 0)
+ {
+ subwentry->last_error_count++;
+ subwentry->last_error_time = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ subwentry->last_error_relid = msg->m_relid;
+ subwentry->last_error_command = msg->m_command;
+ subwentry->last_error_xid = msg->m_xid;
+ subwentry->last_error_count = 1;
+ subwentry->last_error_time = msg->m_timestamp;
+ strlcpy(subwentry->last_error_message, msg->m_message,
+ PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
/* ----------
* pgstat_write_statsfile_needed() -
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ae1b391bda..2e79302a48 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3332,6 +3332,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3432,8 +3433,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding message
+ * type for table synchronization.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3551,7 +3574,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e64857e540..cbca7167a6 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2172,7 +2172,7 @@ pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS)
{
Oid taboid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(taboid, RESET_TABLE);
+ pgstat_reset_single_counter(taboid, InvalidOid, RESET_TABLE);
PG_RETURN_VOID();
}
@@ -2182,7 +2182,18 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
{
Oid funcoid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(funcoid, RESET_FUNCTION);
+ pgstat_reset_single_counter(funcoid, InvalidOid, RESET_FUNCTION);
+
+ PG_RETURN_VOID();
+}
+
+Datum
+pg_stat_reset_subscription_worker_subrel(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid = PG_ARGISNULL(1) ? InvalidOid : PG_GETARG_OID(1);
+
+ pgstat_reset_single_counter(subid, relid, RESET_SUBWORKER);
PG_RETURN_VOID();
}
@@ -2240,6 +2251,21 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset all subscription worker stats associated with the given subscription */
+Datum
+pg_stat_reset_subscription_worker_sub(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+
+ /*
+ * Use subscription drop message to remove statistics of all subscription
+ * workers.
+ */
+ pgstat_report_subscription_drop(subid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2380,3 +2406,100 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 8
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_stat_subworker_entry(subid, subrelid);
+
+ /* Return NULL if there are no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "last_error_relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "last_error_command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "last_error_xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_error_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "last_error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_relid */
+ if (OidIsValid(wentry->last_error_relid))
+ values[i++] = ObjectIdGetDatum(wentry->last_error_relid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_command */
+ if (wentry->last_error_command != 0)
+ values[i++] =
+ CStringGetTextDatum(logicalrep_message_type(wentry->last_error_command));
+ else
+ nulls[i++] = true;
+
+ /* last_error_xid */
+ if (TransactionIdIsValid(wentry->last_error_xid))
+ values[i++] = TransactionIdGetDatum(wentry->last_error_xid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_count */
+ values[i++] = Int64GetDatum(wentry->last_error_count);
+
+ /* last_error_message */
+ values[i++] = CStringGetTextDatum(wentry->last_error_message);
+
+ /* last_error_time */
+ if (wentry->last_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->last_error_time);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index e934361dc3..79d787cd26 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5389,6 +5389,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription worker',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,last_error_relid,last_error_command,last_error_xid,last_error_count,last_error_message,last_error_time}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5776,6 +5784,16 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker_subrel' },
+{ oid => '8525',
+ descr => 'statistics: reset all collected statistics for a single subscription',
+ proname => 'pg_stat_reset_subscription_worker',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid',
+ prosrc => 'pg_stat_reset_subscription_worker_sub' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..dd1c28803c 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -83,6 +84,8 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
} StatMsgType;
/* ----------
@@ -145,7 +148,8 @@ typedef enum PgStat_Shared_Reset_Target
typedef enum PgStat_Single_Reset_Type
{
RESET_TABLE,
- RESET_FUNCTION
+ RESET_FUNCTION,
+ RESET_SUBWORKER
} PgStat_Single_Reset_Type;
/* ------------------------------------------------------------
@@ -364,6 +368,7 @@ typedef struct PgStat_MsgResetsinglecounter
Oid m_databaseid;
PgStat_Single_Reset_Type m_resettype;
Oid m_objectid;
+ Oid m_subobjectid;
} PgStat_MsgResetsinglecounter;
/* ----------
@@ -536,6 +541,54 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about the dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_databaseid;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync
+ * worker to report the error occurred while
+ * processing changes.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid identify the subscription and the reporter of
+ * the error: m_subrelid is InvalidOid if the error is reported by an
+ * apply worker, otherwise it is reported by a table sync worker.
+ */
+ Oid m_databaseid;
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * OID of the table that the reporter was actually processing. m_relid
+ * can be InvalidOid if the error occurred while the worker was applying
+ * a non-data-modification message such as RELATION.
+ */
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -714,6 +767,8 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
} PgStat_Msg;
@@ -768,11 +823,16 @@ typedef struct PgStat_StatDBEntry
TimestampTz stats_timestamp; /* time of db stats file update */
/*
- * tables and functions must be last in the struct, because we don't write
- * the pointers out to the stats file.
+ * tables, functions, and subscription workers must be last in the struct,
+ * because we don't write the pointers out to the stats file.
+ *
+ * subworkers is the hash table of PgStat_StatSubWorkerEntry which stores
+ * statistics of logical replication workers: apply worker and table sync
+ * worker.
*/
HTAB *tables;
HTAB *functions;
+ HTAB *subworkers;
} PgStat_StatDBEntry;
@@ -929,6 +989,34 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of logical replication or the initial table
+ * synchronization.
+ */
+ Oid last_error_relid;
+ LogicalRepMsgType last_error_command;
+ TransactionId last_error_xid;
+ PgStat_Counter last_error_count;
+ TimestampTz last_error_time;
+ char last_error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1019,7 +1107,8 @@ extern void pgstat_drop_database(Oid databaseid);
extern void pgstat_clear_snapshot(void);
extern void pgstat_reset_counters(void);
extern void pgstat_reset_shared_counters(const char *);
-extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
+extern void pgstat_reset_single_counter(Oid objectid, Oid subobjectid,
+ PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
@@ -1038,6 +1127,10 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
@@ -1129,6 +1222,8 @@ extern void pgstat_send_wal(bool force);
extern PgStat_StatDBEntry *pgstat_fetch_stat_dbentry(Oid dbid);
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_stat_subworker_entry(Oid subid,
+ Oid subrelid);
extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..b58b062b10 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,24 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT w.subid,
+ s.subname,
+ w.subrelid,
+ w.last_error_relid,
+ w.last_error_command,
+ w.last_error_xid,
+ w.last_error_count,
+ w.last_error_message,
+ w.last_error_time
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, last_error_relid, last_error_command, last_error_xid, last_error_count, last_error_message, last_error_time)
+ JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/026_error_report.pl b/src/test/subscription/t/026_error_report.pl
new file mode 100644
index 0000000000..9dc9e20ad6
--- /dev/null
+++ b/src/test/subscription/t/026_error_report.pl
@@ -0,0 +1,181 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 5;
+
+# Test if the error reported in the pg_stat_subscription_workers view is expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass];
+ $check_sql .= " AND last_error_xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, last_error_command, last_error_relid::regclass, last_error_count > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+]);
+
+# Check that there are no subscription errors before starting logical replication.
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscription. The table sync for test_tab2 on tap_sub will enter an
+# infinite error loop due to violation of the unique constraint.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (streaming = off);");
+
+$node_publisher->wait_for_catchup('tap_sub');
+
+# Wait for initial table sync for test_tab1 to finish.
+$node_subscriber->poll_query_until('postgres',
+ q[
+SELECT count(1) = 1 FROM pg_subscription_rel
+WHERE srrelid = 'test_tab1'::regclass AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to
+# violation of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Test for resetting subscription worker statistics.
+# Truncate test_tab1 and test_tab2 so that applying changes and table sync can
+# continue, respectively.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "TRUNCATE test_tab1, test_tab2;");
+
+# Wait for the data to be replicated.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab1");
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab2");
+
+# Reset stats of all subscription workers running on tap_sub.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+SELECT pg_stat_reset_subscription_worker(sw.subid)
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Wait for stats of all subscription workers running on tap_sub to be reset.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) = 0
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Check that the view doesn't show any entries after dropping the subscription.
+$node_subscriber->safe_psql(
+ 'postgres',
+ q[
+DROP SUBSCRIPTION tap_sub;
+]);
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8ed83..f41ef0d2bc 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1943,6 +1943,8 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionPurge
+PgStat_MsgSubWorkerError
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1954,6 +1956,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
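As an aside for readers following the patch above: the last-error deduplication rule in pgstat_recv_subworker_error() can be sketched, purely illustratively and in Python rather than the patch's C, like this. The field names mirror PgStat_StatSubWorkerEntry; everything else (class name, numeric OIDs) is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class SubWorkerEntry:
    """Illustrative mirror of PgStat_StatSubWorkerEntry's last_error_* fields."""
    last_error_relid: int = 0
    last_error_command: str = ""
    last_error_xid: int = 0
    last_error_count: int = 0
    last_error_time: float = 0.0
    last_error_message: str = ""


def record_error(entry, relid, command, xid, timestamp, message):
    # Same error as last time: only bump the counter and the timestamp,
    # mirroring the early-return branch in pgstat_recv_subworker_error().
    if (entry.last_error_relid == relid
            and entry.last_error_command == command
            and entry.last_error_xid == xid
            and entry.last_error_message == message):
        entry.last_error_count += 1
        entry.last_error_time = timestamp
        return
    # A different error: overwrite the details and reset the counter to 1.
    entry.last_error_relid = relid
    entry.last_error_command = command
    entry.last_error_xid = xid
    entry.last_error_count = 1
    entry.last_error_time = timestamp
    entry.last_error_message = message
```

This is why a worker stuck in an error loop shows a growing last_error_count rather than repeatedly rewritten error details.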
On Thu, Nov 25, 2021 at 7:36 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Nov 17, 2021 at 8:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Right. I've fixed this issue and attached an updated patch.
One very minor comment: "conflict" can be moved to the next line to keep the comment within the 80-character boundary wherever possible:
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(
Similarly in the below:
+# Insert more data to test_tab1, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
The rest of the patch looks good.
Thank you for the comments! These are incorporated into v25 patch I
just submitted.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
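For context, a sketch of how the view and reset functions added by this patch might be used from SQL, based on the column and function names visible in the patch above (subscription name 'tap_sub' is just an example):

```sql
-- Inspect the most recent error per subscription worker.
SELECT subname, subrelid, last_error_relid::regclass,
       last_error_command, last_error_xid, last_error_count,
       last_error_message, last_error_time
FROM pg_stat_subscription_workers;

-- Reset stats of all workers of one subscription (one-argument form).
SELECT pg_stat_reset_subscription_worker(oid)
FROM pg_subscription WHERE subname = 'tap_sub';

-- Reset stats of a single table sync worker (two-argument form).
SELECT pg_stat_reset_subscription_worker(s.oid, 'test_tab2'::regclass)
FROM pg_subscription s WHERE s.subname = 'tap_sub';
```

The last_error_xid column is what a user would feed into a future transaction-skipping command such as the proposed ALTER SUBSCRIPTION ... SET SKIP.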
On Thu, Nov 25, 2021 at 9:08 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Wed, Nov 24, 2021 at 10:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch. Unless I miss something, all
comments I got so far have been incorporated into this patch. Please
review it.

Only a couple of minor points:

src/backend/postmaster/pgstat.c
(1) pgstat_get_subworker_entry
In the following comment, it should say "returns an entry ...":
+ * apply worker otherwise returns entry of the table sync worker associated

src/include/pgstat.h
(2) typedef struct PgStat_StatDBEntry
"subworker" should be "subworkers" in the following comment, to match the struct member name:
* subworker is the hash table of PgStat_StatSubWorkerEntry which stores

Otherwise, the patch LGTM.
Thank you for the comments! These are incorporated into v25 patch I
just submitted.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thur, Nov 25, 2021 8:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Nov 25, 2021 at 1:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:
Changed. I've removed first_error_time as per discussion on the thread for adding xact stats.

We also agreed to change the column names to start with last_error_* [1]. Is there a reason not to make those changes? Do you think that we can change it just before committing that patch? I thought it might be better to do it that way now itself.

Oh, I thought you meant that we would change the column names when adding the xact stats to the view. But these names also make sense even without the xact stats. I've attached an updated patch. It also incorporates comments from Vignesh and Greg.
Hi,
I only noticed some minor things in the testcases
1)
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+]);
It seems we don't need to set logical_decoding_work_mem since we don't test streaming?
2)
+$node_publisher->safe_psql('postgres',
+ q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+]);
There are a few places where only one command exists in the 'q[' or 'qq[' block, like the above code.
To be consistent, I think it might be better to remove the wrapping here; maybe we can write it like:
$node_publisher->safe_psql('postgres',
' CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;');
The others LGTM.
Best regards,
Hou zj
On Thu, Nov 25, 2021 at 10:06 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
Indeed. Attached an updated patch. Thanks!
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v26-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch (application/octet-stream)
From d538b10d0f3fe66086eb32d04b9ca8c90433a302 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v26 1/3] Add a subscription worker statistics view
"pg_stat_subscription_workers".
This commit adds a new system view pg_stat_subscription_workers,
which shows information about any errors that occur during the application
of logical replication changes as well as during initial table
synchronization. The subscription statistics entries are removed when
the corresponding subscription is removed.
It also adds an SQL function pg_stat_reset_subscription_worker() to
reset the error statistics of a single subscription worker.
The contents of this view can be used by an upcoming patch that skips
the particular transaction that conflicts with the existing data on
the subscriber.
This view can be extended in the future to track other xact related
statistics for subscription workers.
---
doc/src/sgml/monitoring.sgml | 157 ++++++++
src/backend/catalog/system_functions.sql | 4 +
src/backend/catalog/system_views.sql | 23 ++
src/backend/commands/subscriptioncmds.c | 16 +-
src/backend/postmaster/pgstat.c | 381 +++++++++++++++++++-
src/backend/replication/logical/worker.c | 54 ++-
src/backend/utils/adt/pgstatfuncs.c | 127 ++++++-
src/include/catalog/pg_proc.dat | 18 +
src/include/pgstat.h | 103 +++++-
src/test/regress/expected/rules.out | 18 +
src/test/subscription/t/026_error_report.pl | 176 +++++++++
src/tools/pgindent/typedefs.list | 4 +
12 files changed, 1060 insertions(+), 21 deletions(-)
create mode 100644 src/test/subscription/t/026_error_report.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index af6914872b..11a513c17e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>One row per subscription worker, showing statistics about errors
+ that occurred on that subscription worker.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3054,6 +3063,128 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription worker on which errors have occurred, for workers
+ applying logical replication changes and workers handling the initial data
+ copy of the subscribed tables. The statistics entry is removed when the
+ corresponding subscription is dropped.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This field
+ is null if the error was reported during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node's transaction that was being applied
+ when the error occurred. This field is null if the error was reported
+ during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Last time at which this error occurred
+ </para></entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5176,6 +5307,32 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type> <optional>, <parameter>relid</parameter> <type>oid</type> </optional> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the statistics of subscription workers running on the
+ subscription with <parameter>subid</parameter> shown in the
+ <structname>pg_stat_subscription_workers</structname> view. If the
+ argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets statistics of the subscription worker handling the initial data
+ copy of the relation with <parameter>relid</parameter>. Otherwise,
+ resets the subscription worker statistics of the main apply worker.
+ If the argument <parameter>relid</parameter> is omitted, resets the
+ statistics of all subscription workers running on the subscription
+ with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index f6789025a5..3a4fa9091b 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,10 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb560955cd..61b515cdb8 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,3 +1261,26 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.last_error_relid,
+ w.last_error_command,
+ w.last_error_xid,
+ w.last_error_count,
+ w.last_error_message,
+ w.last_error_time
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel) sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
+ JOIN pg_subscription s ON (w.subid = s.oid);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26369..9427e86fee 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -32,6 +32,7 @@
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/origin.h"
#include "replication/slot.h"
@@ -1204,7 +1205,8 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Since dropping a replication slot is not transactional, the replication
* slot stays dropped even if the transaction rolls back. So we cannot
* run DROP SUBSCRIPTION inside a transaction block if dropping the
- * replication slot.
+ * replication slot. Also, in this case, we report a message for dropping
+ * the subscription to the stats collector.
*
* XXX The command name should really be something like "DROP SUBSCRIPTION
* of a subscription that is associated with a replication slot", but we
@@ -1377,6 +1379,18 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
PG_END_TRY();
+ /*
+ * Send a message for dropping this subscription to the stats collector.
+ * We can safely report dropping the subscription statistics here if the
+ * subscription is associated with a replication slot since we cannot run
+ * DROP SUBSCRIPTION inside a transaction block. Subscription statistics
+ * will be removed later by (auto)vacuum either if it's not associated
+ * with a replication slot or if the message for dropping the subscription
+ * gets lost.
+ */
+ if (slotname)
+ pgstat_report_subscription_drop(subid);
+
table_close(rel, NoLock);
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8c166e5e16..59ef5923b4 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +106,7 @@
#define PGSTAT_DB_HASH_SIZE 16
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
#define PGSTAT_REPLSLOT_HASH_SIZE 32
@@ -320,10 +322,14 @@ NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_no
static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
Oid tableoid, bool create);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry,
+ Oid subid, Oid subrelid,
+ bool create);
static void pgstat_write_statsfiles(bool permanent, bool allDbs);
static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
-static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, bool permanent);
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+ HTAB *subworkerhash, bool permanent);
static void backend_read_statsfile(void);
static bool pgstat_write_statsfile_needed(void);
@@ -335,6 +341,7 @@ static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, Timestamp
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
+static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
static bool pgstat_should_report_connstat(void);
static void pgstat_report_disconnect(Oid dboid);
@@ -373,6 +380,8 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1302,6 +1311,55 @@ pgstat_vacuum_stat(void)
hash_destroy(htab);
}
+
+ /*
+ * Repeat for subscription workers. Similarly, we needn't bother in the
+ * common case where no subscription workers' stats are being collected.
+ */
+ if (dbentry->subworkers != NULL &&
+ hash_get_num_entries(dbentry->subworkers) > 0)
+ {
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ Oid subid = subwentry->key.subid;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &subid, HASH_FIND, NULL) != NULL)
+ continue;
+
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_send_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_send_subscription_purge(&spmsg);
+
+ hash_destroy(htab);
+ }
}
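As an aside, the batching in the subworkers loop above (fill the fixed-size purge message, flush when it is full, flush the remainder afterwards) can be seen in isolation in the sketch below. The names are toy stand-ins, not the real pgstat API, and the batch size is shrunk to 4 for illustration:

```c
#include <assert.h>

/* Toy stand-ins for the pgstat message batching (not the real API). */
#define NUM_PER_MSG 4            /* stands in for PGSTAT_NUM_SUBSCRIPTIONPURGE */

static int flushes;              /* how many messages were "sent" */
static int sent_total;           /* total entries across all messages */

static void send_purge(int nentries)
{
    flushes++;
    sent_total += nentries;
}

/* Scan ndead dead subscriptions, batching them as pgstat_vacuum_stat() does */
static void purge_dead_subscriptions(int ndead)
{
    int nentries = 0;

    for (int i = 0; i < ndead; i++)
    {
        nentries++;              /* spmsg.m_subids[spmsg.m_nentries++] = subid */
        if (nentries >= NUM_PER_MSG)
        {
            send_purge(nentries);   /* message full: send and reinitialize */
            nentries = 0;
        }
    }

    if (nentries > 0)            /* send the rest of the dead subscriptions */
        send_purge(nentries);
}
```

With 10 dead subscriptions this sends 4 + 4 + 2 entries, i.e. three messages.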
@@ -1474,7 +1532,8 @@ pgstat_reset_shared_counters(const char *target)
* ----------
*/
void
-pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
+pgstat_reset_single_counter(Oid objoid, Oid subobjoid,
+ PgStat_Single_Reset_Type type)
{
PgStat_MsgResetsinglecounter msg;
@@ -1485,6 +1544,7 @@ pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
msg.m_databaseid = MyDatabaseId;
msg.m_resettype = type;
msg.m_objectid = objoid;
+ msg.m_subobjectid = subobjoid;
pgstat_send(&msg, sizeof(msg));
}
@@ -1869,6 +1929,51 @@ pgstat_report_replslot_drop(const char *slotname)
pgstat_send(&msg, sizeof(PgStat_MsgReplSlot));
}
+/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ len = offsetof(PgStat_MsgSubWorkerError, m_message) + strlen(msg.m_message) + 1;
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_report_subscription_drop() -
+ *
+ * Tell the collector about dropping the subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_drop(Oid subid)
+{
+ PgStat_MsgSubscriptionPurge msg;
+
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subids[0] = subid;
+ msg.m_nentries = 1;
+ pgstat_send_subscription_purge(&msg);
+}
+
/* ----------
* pgstat_ping() -
*
@@ -2874,6 +2979,35 @@ pgstat_fetch_stat_funcentry(Oid func_id)
return funcentry;
}
+/*
+ * ---------
+ * pgstat_fetch_stat_subworker_entry() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the subscription worker struct or NULL.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_stat_subworker_entry(Oid subid, Oid subrelid)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *wentry = NULL;
+
+ /* Load the stats file if needed */
+ backend_read_statsfile();
+
+ /*
+ * Lookup our database, then find the requested subscription worker stats.
+ */
+ dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+ if (dbentry != NULL && dbentry->subworkers != NULL)
+ {
+ wentry = pgstat_get_subworker_entry(dbentry, subid, subrelid,
+ false);
+ }
+
+ return wentry;
+}
/*
* ---------
@@ -3312,6 +3446,23 @@ pgstat_send_slru(void)
}
}
+/* --------
+ * pgstat_send_subscription_purge() -
+ *
+ * Send a subscription purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
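The offsetof() arithmetic above (send only the header plus the filled prefix of the trailing array) is easy to check in a standalone sketch; the struct below is a toy mirror of the message shape, not the real PgStat_MsgSubscriptionPurge:

```c
#include <assert.h>
#include <stddef.h>

/* Toy message mirroring the shape (but not the names) of the purge
 * message: a header, a count, and a trailing array of which only the
 * filled prefix is transmitted. */
#define MAX_IDS 32

typedef struct PurgeMsg
{
    int      hdr;                /* stand-in for PgStat_MsgHdr */
    int      nentries;
    unsigned subids[MAX_IDS];
} PurgeMsg;

/* Bytes actually worth sending: everything up to the array, plus the
 * filled elements -- the same offsetof() arithmetic as in the patch. */
static size_t purge_msg_len(const PurgeMsg *msg)
{
    return offsetof(PurgeMsg, subids) + msg->nentries * sizeof(unsigned);
}
```
<imports>
</imports>
<test>
int main(void)
{
    PurgeMsg m;
    m.nentries = 3;
    assert(purge_msg_len(&m) == offsetof(PurgeMsg, subids) + 3 * sizeof(unsigned));
    m.nentries = 0;
    assert(purge_msg_len(&m) == offsetof(PurgeMsg, subids));
    assert(purge_msg_len(&m) < sizeof(PurgeMsg));
    return 0;
}
```
</test>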
/* ----------
* PgstatCollectorMain() -
@@ -3568,6 +3719,14 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
default:
break;
}
@@ -3613,7 +3772,8 @@ PgstatCollectorMain(int argc, char *argv[])
/*
* Subroutine to clear stats in a database entry
*
- * Tables and functions hashes are initialized to empty.
+ * Tables, functions, and subscription workers hashes are initialized
+ * to empty.
*/
static void
reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
@@ -3666,6 +3826,13 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
PGSTAT_FUNCTION_HASH_SIZE,
&hash_ctl,
HASH_ELEM | HASH_BLOBS);
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
}
/*
@@ -3690,7 +3857,7 @@ pgstat_get_db_entry(Oid databaseid, bool create)
/*
* If not found, initialize the new one. This creates empty hash tables
- * for tables and functions, too.
+ * for tables, functions, and subscription workers, too.
*/
if (!found)
reset_dbentry_counters(result);
@@ -3748,6 +3915,47 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
return result;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return subscription worker entry with the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, it returns an entry of the
+ * apply worker; otherwise, it returns an entry of the table sync worker
+ * associated with subrelid. If no entry exists, it is created when the
+ * create parameter is true; otherwise, NULL is returned.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+ (void *) &key,
+ action, &found);
+
+ if (!create && !found)
+ return NULL;
+
+ /* If not found, initialize the new one */
+ if (!found)
+ {
+ subwentry->last_error_relid = InvalidOid;
+ subwentry->last_error_command = 0;
+ subwentry->last_error_xid = InvalidTransactionId;
+ subwentry->last_error_count = 0;
+ subwentry->last_error_time = 0;
+ subwentry->last_error_message[0] = '\0';
+ }
+
+ return subwentry;
+}
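The find-or-create contract of the function above (create=false returns NULL on a miss; create=true inserts a freshly zeroed entry) can be sketched without dynahash; the toy fixed-size table below stands in for dbentry->subworkers, and the names only loosely follow the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy fixed-size table standing in for the subworkers dynahash. */
typedef struct WorkerKey { unsigned subid; unsigned subrelid; } WorkerKey;
typedef struct WorkerEntry
{
    WorkerKey key;
    bool      used;
    long      last_error_count;
} WorkerEntry;

static WorkerEntry table[8];

static WorkerEntry *get_subworker_entry(unsigned subid, unsigned subrelid,
                                        bool create)
{
    WorkerEntry *freeslot = NULL;

    for (int i = 0; i < 8; i++)
    {
        if (table[i].used &&
            table[i].key.subid == subid &&
            table[i].key.subrelid == subrelid)
            return &table[i];                 /* HASH_FIND hit */
        if (!table[i].used && freeslot == NULL)
            freeslot = &table[i];
    }

    if (!create || freeslot == NULL)
        return NULL;                          /* HASH_FIND miss */

    /* HASH_ENTER path: initialize the new entry, as the patch does */
    memset(freeslot, 0, sizeof(*freeslot));
    freeslot->used = true;
    freeslot->key.subid = subid;
    freeslot->key.subrelid = subrelid;
    return freeslot;
}
```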
/* ----------
* pgstat_write_statsfiles() -
@@ -3947,8 +4155,10 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
{
HASH_SEQ_STATUS tstat;
HASH_SEQ_STATUS fstat;
+ HASH_SEQ_STATUS sstat;
PgStat_StatTabEntry *tabentry;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpout;
int32 format_id;
Oid dbid = dbentry->databaseid;
@@ -4003,6 +4213,17 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
(void) rc; /* we'll check for error with ferror */
}
+ /*
+ * Walk through the database's subscription worker stats table.
+ */
+ hash_seq_init(&sstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(subwentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+
/*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
@@ -4241,6 +4462,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
memcpy(dbentry, &dbbuf, sizeof(PgStat_StatDBEntry));
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* In the collector, disregard the timestamp we read from the
@@ -4252,8 +4474,8 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
dbentry->stats_timestamp = 0;
/*
- * Don't create tables/functions hashtables for uninteresting
- * databases.
+ * Don't create tables/functions/subworkers hashtables for
+ * uninteresting databases.
*/
if (onlydb != InvalidOid)
{
@@ -4278,6 +4500,14 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
&hash_ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
/*
* If requested, read the data from the database-specific
* file. Otherwise we just leave the hashtables empty.
@@ -4286,6 +4516,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
pgstat_read_db_statsfile(dbentry->databaseid,
dbentry->tables,
dbentry->functions,
+ dbentry->subworkers,
permanent);
break;
@@ -4363,19 +4594,21 @@ done:
* As in pgstat_read_statsfiles, if the permanent file is requested, it is
* removed after reading.
*
- * Note: this code has the ability to skip storing per-table or per-function
- * data, if NULL is passed for the corresponding hashtable. That's not used
- * at the moment though.
+ * Note: this code has the ability to skip storing per-table, per-function, or
+ * per-subscription-worker data, if NULL is passed for the corresponding hashtable.
+ * That's not used at the moment though.
* ----------
*/
static void
pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
- bool permanent)
+ HTAB *subworkerhash, bool permanent)
{
PgStat_StatTabEntry *tabentry;
PgStat_StatTabEntry tabbuf;
PgStat_StatFuncEntry funcbuf;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry subwbuf;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpin;
int32 format_id;
bool found;
@@ -4489,6 +4722,41 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
memcpy(funcentry, &funcbuf, sizeof(funcbuf));
break;
+ /*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&subwbuf, 1, sizeof(PgStat_StatSubWorkerEntry),
+ fpin) != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /*
+ * Skip if subscription worker data not wanted.
+ */
+ if (subworkerhash == NULL)
+ break;
+
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(subworkerhash,
+ (void *) &subwbuf.key,
+ HASH_ENTER, &found);
+
+ if (found)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ memcpy(subwentry, &subwbuf, sizeof(subwbuf));
+ break;
+
/*
* 'E' The EOF marker of a complete stats file.
*/
@@ -5162,6 +5430,8 @@ pgstat_recv_dropdb(PgStat_MsgDropdb *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
if (hash_search(pgStatDBHash,
(void *) &dbid,
@@ -5199,13 +5469,16 @@ pgstat_recv_resetcounter(PgStat_MsgResetcounter *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* Reset database-level stats, too. This creates empty hash tables for
- * tables and functions.
+ * tables, functions, and subscription workers.
*/
reset_dbentry_counters(dbentry);
}
@@ -5274,6 +5547,14 @@ pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len)
else if (msg->m_resettype == RESET_FUNCTION)
(void) hash_search(dbentry->functions, (void *) &(msg->m_objectid),
HASH_REMOVE, NULL);
+ else if (msg->m_resettype == RESET_SUBWORKER)
+ {
+ PgStat_StatSubWorkerKey key;
+
+ key.subid = msg->m_objectid;
+ key.subrelid = msg->m_subobjectid;
+ (void) hash_search(dbentry->subworkers, (void *) &key, HASH_REMOVE, NULL);
+ }
}
/* ----------
@@ -5816,6 +6097,84 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
}
+/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS hstat;
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+
+ /* No need to purge if we don't even know the database */
+ if (!dbentry || !dbentry->subworkers)
+ return;
+
+ /* Remove all subscription worker statistics of the given subscriptions */
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (subwentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(dbentry->subworkers, (void *) &(subwentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+ /* Get the subscription worker stats */
+ subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+ msg->m_subrelid, true);
+ Assert(subwentry);
+
+ /*
+ * Update only the counter and last error timestamp if we received the
+ * same error again.
+ */
+ if (subwentry->last_error_relid == msg->m_relid &&
+ subwentry->last_error_command == msg->m_command &&
+ subwentry->last_error_xid == msg->m_xid &&
+ strcmp(subwentry->last_error_message, msg->m_message) == 0)
+ {
+ subwentry->last_error_count++;
+ subwentry->last_error_time = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ subwentry->last_error_relid = msg->m_relid;
+ subwentry->last_error_command = msg->m_command;
+ subwentry->last_error_xid = msg->m_xid;
+ subwentry->last_error_count = 1;
+ subwentry->last_error_time = msg->m_timestamp;
+ strlcpy(subwentry->last_error_message, msg->m_message,
+ PGSTAT_SUBWORKERERROR_MSGLEN);
+}
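The "same error again" check above (an identical (relid, command, xid, message) repeat only bumps the counter and timestamp; anything else replaces the stored details) is sketched standalone below, with simplified toy fields rather than the real PgStat_StatSubWorkerEntry:

```c
#include <assert.h>
#include <string.h>

/* Toy entry for the consecutive-error check; field names follow the
 * patch loosely. */
typedef struct ErrEntry
{
    unsigned relid;
    int      command;
    unsigned xid;
    long     count;
    char     message[256];
} ErrEntry;

static void record_error(ErrEntry *e, unsigned relid, int command,
                         unsigned xid, const char *message)
{
    if (e->relid == relid && e->command == command && e->xid == xid &&
        strcmp(e->message, message) == 0)
    {
        e->count++;              /* same error again: just count it */
        return;
    }

    /* otherwise, replace the error information */
    e->relid = relid;
    e->command = command;
    e->xid = xid;
    e->count = 1;
    strncpy(e->message, message, sizeof(e->message) - 1);
    e->message[sizeof(e->message) - 1] = '\0';
}
```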
+
/* ----------
* pgstat_write_statsfile_needed() -
*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ae1b391bda..2e79302a48 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3332,6 +3332,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3432,8 +3433,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding message
+ * type for table synchronization.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3551,7 +3574,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e64857e540..cbca7167a6 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2172,7 +2172,7 @@ pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS)
{
Oid taboid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(taboid, RESET_TABLE);
+ pgstat_reset_single_counter(taboid, InvalidOid, RESET_TABLE);
PG_RETURN_VOID();
}
@@ -2182,7 +2182,18 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
{
Oid funcoid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(funcoid, RESET_FUNCTION);
+ pgstat_reset_single_counter(funcoid, InvalidOid, RESET_FUNCTION);
+
+ PG_RETURN_VOID();
+}
+
+Datum
+pg_stat_reset_subscription_worker_subrel(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid = PG_ARGISNULL(1) ? InvalidOid : PG_GETARG_OID(1);
+
+ pgstat_reset_single_counter(subid, relid, RESET_SUBWORKER);
PG_RETURN_VOID();
}
@@ -2240,6 +2251,21 @@ pg_stat_reset_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/* Reset all subscription worker stats associated with the given subscription */
+Datum
+pg_stat_reset_subscription_worker_sub(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+
+ /*
+ * Use subscription drop message to remove statistics of all subscription
+ * workers.
+ */
+ pgstat_report_subscription_drop(subid);
+
+ PG_RETURN_VOID();
+}
+
Datum
pg_stat_get_archiver(PG_FUNCTION_ARGS)
{
@@ -2380,3 +2406,100 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 8
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_stat_subworker_entry(subid, subrelid);
+
+ /* Return NULL if there are no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "last_error_relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "last_error_command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "last_error_xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_error_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "last_error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_relid */
+ if (OidIsValid(wentry->last_error_relid))
+ values[i++] = ObjectIdGetDatum(wentry->last_error_relid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_command */
+ if (wentry->last_error_command != 0)
+ values[i++] =
+ CStringGetTextDatum(logicalrep_message_type(wentry->last_error_command));
+ else
+ nulls[i++] = true;
+
+ /* last_error_xid */
+ if (TransactionIdIsValid(wentry->last_error_xid))
+ values[i++] = TransactionIdGetDatum(wentry->last_error_xid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_count */
+ values[i++] = Int64GetDatum(wentry->last_error_count);
+
+ /* last_error_message */
+ values[i++] = CStringGetTextDatum(wentry->last_error_message);
+
+ /* last_error_time */
+ if (wentry->last_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->last_error_time);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index e934361dc3..79d787cd26 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5389,6 +5389,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription worker',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,last_error_relid,last_error_command,last_error_xid,last_error_count,last_error_message,last_error_time}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5776,6 +5784,16 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker_subrel' },
+{ oid => '8525',
+ descr => 'statistics: reset all collected statistics for a single subscription',
+ proname => 'pg_stat_reset_subscription_worker',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid',
+ prosrc => 'pg_stat_reset_subscription_worker_sub' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588ea2..dd1c28803c 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -83,6 +84,8 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
} StatMsgType;
/* ----------
@@ -145,7 +148,8 @@ typedef enum PgStat_Shared_Reset_Target
typedef enum PgStat_Single_Reset_Type
{
RESET_TABLE,
- RESET_FUNCTION
+ RESET_FUNCTION,
+ RESET_SUBWORKER
} PgStat_Single_Reset_Type;
/* ------------------------------------------------------------
@@ -364,6 +368,7 @@ typedef struct PgStat_MsgResetsinglecounter
Oid m_databaseid;
PgStat_Single_Reset_Type m_resettype;
Oid m_objectid;
+ Oid m_subobjectid;
} PgStat_MsgResetsinglecounter;
/* ----------
@@ -536,6 +541,54 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about the dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_databaseid;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync
+ * worker to report an error that occurred
+ * while processing changes.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of the error. m_subrelid is InvalidOid if the error is
+ * reported by an apply worker; otherwise it is reported by a table sync worker.
+ */
+ Oid m_databaseid;
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * OID of the table that the reporter was actually processing. m_relid can
+ * be InvalidOid if the error occurred while the worker was applying a
+ * non-data-modification message such as RELATION.
+ */
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -714,6 +767,8 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
} PgStat_Msg;
@@ -768,11 +823,16 @@ typedef struct PgStat_StatDBEntry
TimestampTz stats_timestamp; /* time of db stats file update */
/*
- * tables and functions must be last in the struct, because we don't write
- * the pointers out to the stats file.
+ * tables, functions, and subscription workers must be last in the struct,
+ * because we don't write the pointers out to the stats file.
+ *
+ * subworkers is the hash table of PgStat_StatSubWorkerEntry which stores
+ * statistics of logical replication workers: apply worker and table sync
+ * worker.
*/
HTAB *tables;
HTAB *functions;
+ HTAB *subworkers;
} PgStat_StatDBEntry;
@@ -929,6 +989,34 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+ Oid subrelid; /* InvalidOid for apply worker, otherwise for
+ * table sync worker */
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of logical replication or the initial table
+ * synchronization.
+ */
+ Oid last_error_relid;
+ LogicalRepMsgType last_error_command;
+ TransactionId last_error_xid;
+ PgStat_Counter last_error_count;
+ TimestampTz last_error_time;
+ char last_error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1019,7 +1107,8 @@ extern void pgstat_drop_database(Oid databaseid);
extern void pgstat_clear_snapshot(void);
extern void pgstat_reset_counters(void);
extern void pgstat_reset_shared_counters(const char *);
-extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
+extern void pgstat_reset_single_counter(Oid objectid, Oid subobjectid,
+ PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
@@ -1038,6 +1127,10 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
@@ -1129,6 +1222,8 @@ extern void pgstat_send_wal(bool force);
extern PgStat_StatDBEntry *pgstat_fetch_stat_dbentry(Oid dbid);
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_stat_subworker_entry(Oid subid,
+ Oid subrelid);
extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3c29..b58b062b10 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,24 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT w.subid,
+ s.subname,
+ w.subrelid,
+ w.last_error_relid,
+ w.last_error_command,
+ w.last_error_xid,
+ w.last_error_count,
+ w.last_error_message,
+ w.last_error_time
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, last_error_relid, last_error_command, last_error_xid, last_error_count, last_error_message, last_error_time)
+ JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/026_error_report.pl b/src/test/subscription/t/026_error_report.pl
new file mode 100644
index 0000000000..7a6991a5da
--- /dev/null
+++ b/src/test/subscription/t/026_error_report.pl
@@ -0,0 +1,176 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error reporting.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 5;
+
+# Test if the error reported in the pg_stat_subscription_workers view is as expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass];
+ $check_sql .= " AND last_error_xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, last_error_command, last_error_relid::regclass, last_error_count > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 5s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;");
+
+# Check that there are no subscription errors before starting logical replication.
+my $result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscription. The table sync for test_tab2 on tap_sub will enter into
+# infinite error loop due to violating the unique constraint.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (streaming = off);");
+
+$node_publisher->wait_for_catchup('tap_sub');
+
+# Wait for initial table sync for test_tab1 to finish.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) = 1 FROM pg_subscription_rel
+WHERE srrelid = 'test_tab1'::regclass AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to
+# violation of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Test for resetting subscription worker statistics.
+# Truncate test_tab1 and test_tab2 so that applying changes and table sync can
+# continue, respectively.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "TRUNCATE test_tab1, test_tab2;");
+
+# Wait for the data to be replicated.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab1");
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab2");
+
+# Reset stats of all subscription workers running on tap_sub.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+SELECT pg_stat_reset_subscription_worker(sw.subid)
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Wait for stats of all subscription workers running on tap_sub to be reset.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) = 0
+FROM pg_stat_subscription_workers sw
+ JOIN pg_subscription s ON s.oid = sw.subid
+WHERE
+ s.subname = 'tap_sub';
+]);
+
+# Check that the view doesn't show any entries after dropping the subscription.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "DROP SUBSCRIPTION tap_sub;");
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8ed83..f41ef0d2bc 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1943,6 +1943,8 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionPurge
+PgStat_MsgSubWorkerError
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1954,6 +1956,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
2.24.3 (Apple Git-128)
On Friday, November 26, 2021 9:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Indeed. Attached an updated patch. Thanks!
Thanks for your patch. A small comment:
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
Should we modify it to "OID of the relation that the worker was synchronizing ..."?
The rest of the patch LGTM.
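
For anyone trying the patch out, once an apply error has occurred the new view can be inspected with something like the following (the subscription name here is just a placeholder; the columns follow the view definition in the patch):

```sql
-- Show the last error recorded for each worker of one subscription.
-- subrelid is NULL for the main apply worker, non-NULL for a table
-- sync worker.
SELECT subname, subrelid, last_error_command,
       last_error_relid::regclass, last_error_xid,
       last_error_count, last_error_message
FROM pg_stat_subscription_workers
WHERE subname = 'tap_sub';
```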
Regards
Tang
On Fri, Nov 26, 2021 at 6:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Indeed. Attached an updated patch. Thanks!
I have made a number of changes in the attached patch:
(a) the patch was trying to register multiple array entries for the
same subscription, which doesn't seem to be required; see the changes
in pgstat_vacuum_stat,
(b) several changes in the test: reduced wal_retrieve_retry_interval
to 2s, which cut the test time in half; removed the check related to
resetting of stats, as there is no guarantee that the message will be
received by the collector and we were not sending it again; and
renamed the test file to 026_stats so that more subscription-related
stats tests can be added to the same file,
(c) added/edited multiple comments,
(d) updated PGSTAT_FILE_FORMAT_ID.
Do let me know what you think of the attached.
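
As a quick reference, the two reset signatures added by the patch can be exercised like below ('tap_sub' and 'test_tab1' are placeholder names from the test, not required objects):

```sql
-- One-argument form: reset stats of all workers of one subscription.
SELECT pg_stat_reset_subscription_worker(oid)
FROM pg_subscription WHERE subname = 'tap_sub';

-- Two-argument form: reset stats of the table sync worker for a single
-- relation; pass NULL as the second argument to reset the main apply
-- worker instead.
SELECT pg_stat_reset_subscription_worker(s.oid, 'test_tab1'::regclass)
FROM pg_subscription s WHERE s.subname = 'tap_sub';
```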
--
With Regards,
Amit Kapila.
Attachments:
v27-0001-Add-a-view-to-show-the-stats-of-subscription-wor.patch
From ddb99bf6725583ecb4bab122444033eb9ac8d91b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v27] Add a view to show the stats of subscription workers.
This commit adds a new system view pg_stat_subscription_workers, that
shows information about any errors which occur during the application of
logical replication changes as well as during performing initial table
synchronization. The subscription statistics entries are removed when the
corresponding subscription is removed.
It also adds an SQL function pg_stat_reset_subscription_worker() to reset
single subscription errors.
The contents of this view can be used by an upcoming patch that skips the
particular transaction that conflicts with the existing data on the
subscriber.
This view can be extended in the future to track other xact related
statistics for subscription workers.
Author: Masahiko Sawada
Reviewed-by: Greg Nancarrow, Hou Zhijie, Tang Haiying, Vignesh C, Dilip Kumar, Takamichi Osumi, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
doc/src/sgml/monitoring.sgml | 157 ++++++++++++
src/backend/catalog/system_functions.sql | 4 +
src/backend/catalog/system_views.sql | 23 ++
src/backend/commands/subscriptioncmds.c | 16 +-
src/backend/postmaster/pgstat.c | 409 +++++++++++++++++++++++++++++--
src/backend/replication/logical/worker.c | 54 +++-
src/backend/utils/adt/pgstatfuncs.c | 128 +++++++++-
src/include/catalog/pg_proc.dat | 18 ++
src/include/pgstat.h | 109 +++++++-
src/test/regress/expected/rules.out | 18 ++
src/test/subscription/t/026_stats.pl | 154 ++++++++++++
src/tools/pgindent/typedefs.list | 4 +
12 files changed, 1068 insertions(+), 26 deletions(-)
create mode 100644 src/test/subscription/t/026_stats.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index af69148..62f2a33 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>One row per subscription worker, showing statistics about errors
+ that occurred on that subscription worker.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3054,6 +3063,128 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription worker on which errors have occurred, for workers
+ applying logical replication changes and workers handling the initial data
+ copy of the subscribed tables. The statistics entry is removed when the
+ corresponding subscription is dropped.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of the command being applied when the error occurred. This field
+ is null if the error was reported during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is null if the error was reported
+ during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_count</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Last time at which this error occurred
+ </para></entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5176,6 +5307,32 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type> <optional>, <parameter>relid</parameter> <type>oid</type> </optional> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the statistics of subscription workers running on the
+ subscription with <parameter>subid</parameter> shown in the
+ <structname>pg_stat_subscription_workers</structname> view. If the
+ argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets statistics of the subscription worker handling the initial data
+ copy of the relation with <parameter>relid</parameter>. Otherwise,
+ resets the subscription worker statistics of the main apply worker.
+ If the argument <parameter>relid</parameter> is omitted, resets the
+ statistics of all subscription workers running on the subscription
+ with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index f678902..3a4fa90 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,10 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb56095..61b515c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,3 +1261,26 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.last_error_relid,
+ w.last_error_command,
+ w.last_error_xid,
+ w.last_error_count,
+ w.last_error_message,
+ w.last_error_time
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel) sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
+ JOIN pg_subscription s ON (w.subid = s.oid);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26..9427e86 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -32,6 +32,7 @@
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/origin.h"
#include "replication/slot.h"
@@ -1204,7 +1205,8 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Since dropping a replication slot is not transactional, the replication
* slot stays dropped even if the transaction rolls back. So we cannot
* run DROP SUBSCRIPTION inside a transaction block if dropping the
- * replication slot.
+ * replication slot. Also, in this case, we send a message to the stats
+ * collector to drop the subscription statistics.
*
* XXX The command name should really be something like "DROP SUBSCRIPTION
* of a subscription that is associated with a replication slot", but we
@@ -1377,6 +1379,18 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
PG_END_TRY();
+ /*
+ * Send a message to the stats collector to drop this subscription's
+ * statistics. It is safe to report the drop here if the subscription is
+ * associated with a replication slot, since in that case DROP
+ * SUBSCRIPTION cannot run inside a transaction block. If the subscription
+ * is not associated with a replication slot, or if the drop message gets
+ * lost, the statistics will be removed later by (auto)vacuum.
+ */
+ if (slotname)
+ pgstat_report_subscription_drop(subid);
+
table_close(rel, NoLock);
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8c166e5..7264d2c 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +106,7 @@
#define PGSTAT_DB_HASH_SIZE 16
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
#define PGSTAT_REPLSLOT_HASH_SIZE 32
@@ -320,10 +322,14 @@ NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_no
static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
Oid tableoid, bool create);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry,
+ Oid subid, Oid subrelid,
+ bool create);
static void pgstat_write_statsfiles(bool permanent, bool allDbs);
static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
-static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, bool permanent);
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+ HTAB *subworkerhash, bool permanent);
static void backend_read_statsfile(void);
static bool pgstat_write_statsfile_needed(void);
@@ -335,6 +341,7 @@ static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, Timestamp
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
+static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
static bool pgstat_should_report_connstat(void);
static void pgstat_report_disconnect(Oid dboid);
@@ -373,6 +380,8 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1302,6 +1311,74 @@ pgstat_vacuum_stat(void)
hash_destroy(htab);
}
+
+ /*
+ * Repeat for subscription workers. Similarly, we needn't bother in the
+ * common case where no subscription workers' stats are being collected.
+ */
+ if (dbentry->subworkers != NULL &&
+ hash_get_num_entries(dbentry->subworkers) > 0)
+ {
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ bool exists = false;
+ Oid subid = subwentry->key.subid;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &subid, HASH_FIND, NULL) != NULL)
+ continue;
+
+ /*
+ * A subscription may have multiple entries, one for the apply
+ * worker and one per table sync worker. In that case, don't add
+ * the same subid again.
+ */
+ for (int i = 0; i < spmsg.m_nentries; i++)
+ {
+ if (spmsg.m_subids[i] == subid)
+ {
+ exists = true;
+ break;
+ }
+ }
+
+ if (exists)
+ continue;
+
+ /* This subscription is dead; add its subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_send_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_send_subscription_purge(&spmsg);
+
+ hash_destroy(htab);
+ }
}
@@ -1474,7 +1551,8 @@ pgstat_reset_shared_counters(const char *target)
* ----------
*/
void
-pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
+pgstat_reset_single_counter(Oid objoid, Oid subobjoid,
+ PgStat_Single_Reset_Type type)
{
PgStat_MsgResetsinglecounter msg;
@@ -1485,6 +1563,7 @@ pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
msg.m_databaseid = MyDatabaseId;
msg.m_resettype = type;
msg.m_objectid = objoid;
+ msg.m_subobjectid = subobjoid;
pgstat_send(&msg, sizeof(msg));
}
@@ -1870,6 +1949,51 @@ pgstat_report_replslot_drop(const char *slotname)
}
/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ len = offsetof(PgStat_MsgSubWorkerError, m_message) + strlen(msg.m_message) + 1;
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_report_subscription_drop() -
+ *
+ * Tell the collector about dropping the subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_drop(Oid subid)
+{
+ PgStat_MsgSubscriptionPurge msg;
+
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subids[0] = subid;
+ msg.m_nentries = 1;
+ pgstat_send_subscription_purge(&msg);
+}
+
+/* ----------
* pgstat_ping() -
*
* Send some junk data to the collector to increase traffic.
@@ -2874,6 +2998,35 @@ pgstat_fetch_stat_funcentry(Oid func_id)
return funcentry;
}
+/*
+ * ---------
+ * pgstat_fetch_stat_subworker_entry() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * the collected statistics for the given subscription worker, or NULL.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_stat_subworker_entry(Oid subid, Oid subrelid)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *wentry = NULL;
+
+ /* Load the stats file if needed */
+ backend_read_statsfile();
+
+ /*
+ * Look up our database, then find the requested subscription worker stats.
+ */
+ dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+ if (dbentry != NULL && dbentry->subworkers != NULL)
+ {
+ wentry = pgstat_get_subworker_entry(dbentry, subid, subrelid,
+ false);
+ }
+
+ return wentry;
+}
/*
* ---------
@@ -3312,6 +3465,23 @@ pgstat_send_slru(void)
}
}
+/* --------
+ * pgstat_send_subscription_purge() -
+ *
+ * Send a subscription purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
/* ----------
* PgstatCollectorMain() -
@@ -3568,6 +3738,14 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
default:
break;
}
@@ -3613,7 +3791,8 @@ PgstatCollectorMain(int argc, char *argv[])
/*
* Subroutine to clear stats in a database entry
*
- * Tables and functions hashes are initialized to empty.
+ * Tables, functions, and subscription workers hashes are initialized
+ * to empty.
*/
static void
reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
@@ -3666,6 +3845,13 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
PGSTAT_FUNCTION_HASH_SIZE,
&hash_ctl,
HASH_ELEM | HASH_BLOBS);
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
}
/*
@@ -3690,7 +3876,7 @@ pgstat_get_db_entry(Oid databaseid, bool create)
/*
* If not found, initialize the new one. This creates empty hash tables
- * for tables and functions, too.
+ * for tables, functions, and subscription workers, too.
*/
if (!found)
reset_dbentry_counters(result);
@@ -3748,6 +3934,47 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
return result;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return the subscription worker entry with the given subscription OID
+ * and relation OID. If subrelid is InvalidOid, the entry for the apply
+ * worker is returned; otherwise, the entry for the table sync worker
+ * associated with subrelid is returned. If no entry exists, initialize
+ * it if the create parameter is true; otherwise return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+ (void *) &key,
+ action, &found);
+
+ if (!create && !found)
+ return NULL;
+
+ /* If not found, initialize the new one */
+ if (!found)
+ {
+ subwentry->last_error_relid = InvalidOid;
+ subwentry->last_error_command = 0;
+ subwentry->last_error_xid = InvalidTransactionId;
+ subwentry->last_error_count = 0;
+ subwentry->last_error_time = 0;
+ subwentry->last_error_message[0] = '\0';
+ }
+
+ return subwentry;
+}
/* ----------
* pgstat_write_statsfiles() -
@@ -3832,8 +4059,8 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
while ((dbentry = (PgStat_StatDBEntry *) hash_seq_search(&hstat)) != NULL)
{
/*
- * Write out the table and function stats for this DB into the
- * appropriate per-DB stat file, if required.
+ * Write out the table, function, and subscription-worker stats for
+ * this DB into the appropriate per-DB stat file, if required.
*/
if (allDbs || pgstat_db_requested(dbentry->databaseid))
{
@@ -3947,8 +4174,10 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
{
HASH_SEQ_STATUS tstat;
HASH_SEQ_STATUS fstat;
+ HASH_SEQ_STATUS sstat;
PgStat_StatTabEntry *tabentry;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpout;
int32 format_id;
Oid dbid = dbentry->databaseid;
@@ -4004,6 +4233,17 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
}
/*
+ * Walk through the database's subscription worker stats table.
+ */
+ hash_seq_init(&sstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(subwentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+
+ /*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
* after each individual fputc or fwrite above.
@@ -4061,8 +4301,9 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
* files after reading; the in-memory status is now authoritative, and the
* files would be out of date in case somebody else reads them.
*
- * If a 'deep' read is requested, table/function stats are read, otherwise
- * the table/function hash tables remain empty.
+ * If a 'deep' read is requested, table/function/subscription-worker stats are
+ * read, otherwise the table/function/subscription-worker hash tables remain
+ * empty.
* ----------
*/
static HTAB *
@@ -4241,6 +4482,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
memcpy(dbentry, &dbbuf, sizeof(PgStat_StatDBEntry));
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* In the collector, disregard the timestamp we read from the
@@ -4252,8 +4494,8 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
dbentry->stats_timestamp = 0;
/*
- * Don't create tables/functions hashtables for uninteresting
- * databases.
+ * Don't create tables/functions/subworkers hashtables for
+ * uninteresting databases.
*/
if (onlydb != InvalidOid)
{
@@ -4278,6 +4520,14 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
&hash_ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
/*
* If requested, read the data from the database-specific
* file. Otherwise we just leave the hashtables empty.
@@ -4286,6 +4536,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
pgstat_read_db_statsfile(dbentry->databaseid,
dbentry->tables,
dbentry->functions,
+ dbentry->subworkers,
permanent);
break;
@@ -4363,19 +4614,21 @@ done:
* As in pgstat_read_statsfiles, if the permanent file is requested, it is
* removed after reading.
*
- * Note: this code has the ability to skip storing per-table or per-function
- * data, if NULL is passed for the corresponding hashtable. That's not used
- * at the moment though.
+ * Note: this code has the ability to skip storing per-table, per-function, or
+ * per-subscription-worker data, if NULL is passed for the corresponding hashtable.
+ * That's not used at the moment though.
* ----------
*/
static void
pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
- bool permanent)
+ HTAB *subworkerhash, bool permanent)
{
PgStat_StatTabEntry *tabentry;
PgStat_StatTabEntry tabbuf;
PgStat_StatFuncEntry funcbuf;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry subwbuf;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpin;
int32 format_id;
bool found;
@@ -4490,6 +4743,41 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
break;
/*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&subwbuf, 1, sizeof(PgStat_StatSubWorkerEntry),
+ fpin) != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /*
+ * Skip if subscription worker data not wanted.
+ */
+ if (subworkerhash == NULL)
+ break;
+
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(subworkerhash,
+ (void *) &subwbuf.key,
+ HASH_ENTER, &found);
+
+ if (found)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ memcpy(subwentry, &subwbuf, sizeof(subwbuf));
+ break;
+
+ /*
* 'E' The EOF marker of a complete stats file.
*/
case 'E':
@@ -5162,6 +5450,8 @@ pgstat_recv_dropdb(PgStat_MsgDropdb *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
if (hash_search(pgStatDBHash,
(void *) &dbid,
@@ -5199,13 +5489,16 @@ pgstat_recv_resetcounter(PgStat_MsgResetcounter *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* Reset database-level stats, too. This creates empty hash tables for
- * tables and functions.
+ * tables, functions, and subscription workers.
*/
reset_dbentry_counters(dbentry);
}
@@ -5274,6 +5567,14 @@ pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len)
else if (msg->m_resettype == RESET_FUNCTION)
(void) hash_search(dbentry->functions, (void *) &(msg->m_objectid),
HASH_REMOVE, NULL);
+ else if (msg->m_resettype == RESET_SUBWORKER)
+ {
+ PgStat_StatSubWorkerKey key;
+
+ key.subid = msg->m_objectid;
+ key.subrelid = msg->m_subobjectid;
+ (void) hash_search(dbentry->subworkers, (void *) &key, HASH_REMOVE, NULL);
+ }
}
/* ----------
@@ -5817,6 +6118,84 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS hstat;
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+
+ /* No need to purge if we don't even know the database */
+ if (!dbentry || !dbentry->subworkers)
+ return;
+
+ /* Remove all subscription worker statistics for the given subscriptions */
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (subwentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(dbentry->subworkers, (void *) &(subwentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+ /* Get the subscription worker stats */
+ subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+ msg->m_subrelid, true);
+ Assert(subwentry);
+
+ if (subwentry->last_error_relid == msg->m_relid &&
+ subwentry->last_error_command == msg->m_command &&
+ subwentry->last_error_xid == msg->m_xid &&
+ strcmp(subwentry->last_error_message, msg->m_message) == 0)
+ {
+ /*
+ * The same error occurred again in succession; just update its
+ * timestamp and count.
+ */
+ subwentry->last_error_count++;
+ subwentry->last_error_time = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ subwentry->last_error_relid = msg->m_relid;
+ subwentry->last_error_command = msg->m_command;
+ subwentry->last_error_xid = msg->m_xid;
+ subwentry->last_error_count = 1;
+ subwentry->last_error_time = msg->m_timestamp;
+ strlcpy(subwentry->last_error_message, msg->m_message,
+ PGSTAT_SUBWORKERERROR_MSGLEN);
+}
+
+/* ----------
* pgstat_write_statsfile_needed() -
*
* Do we need to write out any stats files?
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ae1b391..2e79302 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3332,6 +3332,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3432,8 +3433,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. There is no corresponding logical
+ * replication message type for table synchronization, so pass 0.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3551,7 +3574,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e64857e..f529c15 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2172,7 +2172,7 @@ pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS)
{
Oid taboid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(taboid, RESET_TABLE);
+ pgstat_reset_single_counter(taboid, InvalidOid, RESET_TABLE);
PG_RETURN_VOID();
}
@@ -2182,11 +2182,38 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
{
Oid funcoid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(funcoid, RESET_FUNCTION);
+ pgstat_reset_single_counter(funcoid, InvalidOid, RESET_FUNCTION);
PG_RETURN_VOID();
}
+Datum
+pg_stat_reset_subscription_worker_subrel(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid = PG_ARGISNULL(1) ? InvalidOid : PG_GETARG_OID(1);
+
+ pgstat_reset_single_counter(subid, relid, RESET_SUBWORKER);
+
+ PG_RETURN_VOID();
+}
+
+/* Reset all subscription worker stats associated with the given subscription */
+Datum
+pg_stat_reset_subscription_worker_sub(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+
+ /*
+ * Use the subscription drop message to remove the statistics of all
+ * workers associated with this subscription.
+ */
+ pgstat_report_subscription_drop(subid);
+
+ PG_RETURN_VOID();
+}
+
+
/* Reset SLRU counters (a specific one or all of them). */
Datum
pg_stat_reset_slru(PG_FUNCTION_ARGS)
@@ -2380,3 +2407,100 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 8
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_stat_subworker_entry(subid, subrelid);
+
+ /* Return NULL if there are no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise attributes information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "last_error_relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "last_error_command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "last_error_xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_error_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "last_error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_relid */
+ if (OidIsValid(wentry->last_error_relid))
+ values[i++] = ObjectIdGetDatum(wentry->last_error_relid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_command */
+ if (wentry->last_error_command != 0)
+ values[i++] =
+ CStringGetTextDatum(logicalrep_message_type(wentry->last_error_command));
+ else
+ nulls[i++] = true;
+
+ /* last_error_xid */
+ if (TransactionIdIsValid(wentry->last_error_xid))
+ values[i++] = TransactionIdGetDatum(wentry->last_error_xid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_count */
+ values[i++] = Int64GetDatum(wentry->last_error_count);
+
+ /* last_error_message */
+ values[i++] = CStringGetTextDatum(wentry->last_error_message);
+
+ /* last_error_time */
+ if (wentry->last_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->last_error_time);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
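The two reset flavors defined above can be exercised from SQL as follows. This is a hedged usage sketch; the subscription name `test_sub` is hypothetical:

```sql
-- Sketch: reset the apply worker's error stats for one subscription
-- (passing NULL as the relid targets the apply worker entry), then
-- reset the stats of all workers of that subscription via the
-- single-argument form, which reuses the subscription drop message.
SELECT pg_stat_reset_subscription_worker(oid, NULL)
FROM pg_subscription WHERE subname = 'test_sub';

SELECT pg_stat_reset_subscription_worker(oid)
FROM pg_subscription WHERE subname = 'test_sub';
```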
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index e934361..79d787c 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5389,6 +5389,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription worker',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,last_error_relid,last_error_command,last_error_xid,last_error_count,last_error_message,last_error_time}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5776,6 +5784,16 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker_subrel' },
+{ oid => '8525',
+ descr => 'statistics: reset all collected statistics for a single subscription',
+ proname => 'pg_stat_reset_subscription_worker',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid',
+ prosrc => 'pg_stat_reset_subscription_worker_sub' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588..5b51b58 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -83,6 +84,8 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
} StatMsgType;
/* ----------
@@ -145,7 +148,8 @@ typedef enum PgStat_Shared_Reset_Target
typedef enum PgStat_Single_Reset_Type
{
RESET_TABLE,
- RESET_FUNCTION
+ RESET_FUNCTION,
+ RESET_SUBWORKER
} PgStat_Single_Reset_Type;
/* ------------------------------------------------------------
@@ -364,6 +368,7 @@ typedef struct PgStat_MsgResetsinglecounter
Oid m_databaseid;
PgStat_Single_Reset_Type m_resettype;
Oid m_objectid;
+ Oid m_subobjectid;
} PgStat_MsgResetsinglecounter;
/* ----------
@@ -536,6 +541,54 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_databaseid;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync
+ * worker to report an error that occurred
+ * while processing changes.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid identify the subscription and the reporter of
+ * the error: m_subrelid is InvalidOid if the error is reported by an
+ * apply worker, and a valid OID if reported by a table sync worker.
+ */
+ Oid m_databaseid;
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * OID of the table that the reporter was actually processing. m_relid
+ * can be InvalidOid if the error occurred while the worker was applying
+ * a non-data-modification message such as RELATION.
+ */
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -714,6 +767,8 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
} PgStat_Msg;
@@ -725,7 +780,7 @@ typedef union PgStat_Msg
* ------------------------------------------------------------
*/
-#define PGSTAT_FILE_FORMAT_ID 0x01A5BCA4
+#define PGSTAT_FILE_FORMAT_ID 0x01A5BCA5
/* ----------
* PgStat_StatDBEntry The collector's data per database
@@ -768,11 +823,16 @@ typedef struct PgStat_StatDBEntry
TimestampTz stats_timestamp; /* time of db stats file update */
/*
- * tables and functions must be last in the struct, because we don't write
- * the pointers out to the stats file.
+ * tables, functions, and subscription workers must be last in the struct,
+ * because we don't write the pointers out to the stats file.
+ *
+ * subworkers is the hash table of PgStat_StatSubWorkerEntry which stores
+ * statistics of logical replication workers: apply worker and table sync
+ * worker.
*/
HTAB *tables;
HTAB *functions;
+ HTAB *subworkers;
} PgStat_StatDBEntry;
@@ -929,6 +989,38 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+
+ /*
+ * Oid of the table for which tablesync worker will copy the initial data.
+ * An InvalidOid will be assigned for apply workers.
+ */
+ Oid subrelid;
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of changes or the initial table
+ * synchronization.
+ */
+ Oid last_error_relid;
+ LogicalRepMsgType last_error_command;
+ TransactionId last_error_xid;
+ PgStat_Counter last_error_count;
+ TimestampTz last_error_time;
+ char last_error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1019,7 +1111,8 @@ extern void pgstat_drop_database(Oid databaseid);
extern void pgstat_clear_snapshot(void);
extern void pgstat_reset_counters(void);
extern void pgstat_reset_shared_counters(const char *);
-extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
+extern void pgstat_reset_single_counter(Oid objectid, Oid subobjectid,
+ PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
@@ -1038,6 +1131,10 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
@@ -1129,6 +1226,8 @@ extern void pgstat_send_wal(bool force);
extern PgStat_StatDBEntry *pgstat_fetch_stat_dbentry(Oid dbid);
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_stat_subworker_entry(Oid subid,
+ Oid subrelid);
extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3..b58b062 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,24 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT w.subid,
+ s.subname,
+ w.subrelid,
+ w.last_error_relid,
+ w.last_error_command,
+ w.last_error_xid,
+ w.last_error_count,
+ w.last_error_message,
+ w.last_error_time
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, last_error_relid, last_error_command, last_error_xid, last_error_count, last_error_message, last_error_time)
+ JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/026_stats.pl b/src/test/subscription/t/026_stats.pl
new file mode 100644
index 0000000..e64e0a7
--- /dev/null
+++ b/src/test/subscription/t/026_stats.pl
@@ -0,0 +1,154 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error stats.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 5;
+
+# Test if the error reported on pg_stat_subscription_workers view is expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass];
+ $check_sql .= " AND last_error_xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, last_error_command, last_error_relid::regclass, last_error_count > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;");
+
+# There shouldn't be any subscription errors before starting logical replication.
+my $result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscription. The table sync for test_tab2 on tap_sub will enter into
+# infinite error loop due to violating the unique constraint.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (streaming = off);");
+
+$node_publisher->wait_for_catchup('tap_sub');
+
+# Wait for initial table sync for test_tab1 to finish.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) = 1 FROM pg_subscription_rel
+WHERE srrelid = 'test_tab1'::regclass AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to
+# violation of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Test for resetting subscription worker statistics.
+# Truncate test_tab1 and test_tab2 so that applying changes and table sync can
+# continue, respectively.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "TRUNCATE test_tab1, test_tab2;");
+
+# Wait for the data to be replicated.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab1");
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab2");
+
+# There shouldn't be any errors in the view after dropping the subscription.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "DROP SUBSCRIPTION tap_sub;");
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8e..f41ef0d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1943,6 +1943,8 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionPurge
+PgStat_MsgSubWorkerError
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1954,6 +1956,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
1.8.3.1
On Fri, Nov 26, 2021 at 7:45 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Friday, November 26, 2021 9:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Indeed. Attached an updated patch. Thanks!
Thanks for your patch. A small comment:
+      OID of the relation that the worker is synchronizing; null for the
+      main apply worker

Should we modify it to "OID of the relation that the worker was synchronizing ..."?
I don't think this change is required, see the description of the
similar column in pg_stat_subscription.
--
With Regards,
Amit Kapila.
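For reference, once this view is in place, identifying the failing remote transaction that the skip feature discussed in this thread would target might look like the following (a sketch only; the column names are as defined in the patch, and the subscription name tap_sub is a placeholder):

```sql
-- Find the remote XID of the last error for the apply worker of a given
-- subscription; subrelid IS NULL distinguishes the apply worker from the
-- table sync workers.
SELECT subname, last_error_xid, last_error_command,
       last_error_relid::regclass AS relation, last_error_message
FROM pg_stat_subscription_workers
WHERE subname = 'tap_sub' AND subrelid IS NULL;
```

The reported last_error_xid is what a subsequent ALTER SUBSCRIPTION ... SKIP command could then take as input.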
On Sat, Nov 27, 2021 at 7:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 26, 2021 at 6:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Indeed. Attached an updated patch. Thanks!
Thank you for updating the patch!
I have made a number of changes in the attached patch, which include:
(a) the patch was trying to register multiple array entries for the
same subscription, which doesn't seem to be required (see changes in
pgstat_vacuum_stat); (b) multiple changes in the test, like reducing
wal_retrieve_retry_interval to 2s, which has cut the test time in
half; removing the check related to resetting of stats, as there is no
guarantee that the message will be received by the collector and we
were not sending it again; and changing the test case file name to
026_stats, as we can add more subscription-related stats in this test
file itself.
Since we have pg_stat_subscription view, how about 026_worker_stats.pl?
The rest looks good to me.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Nov 29, 2021 at 7:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Nov 27, 2021 at 7:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Thank you for updating the patch!
I have made a number of changes in the attached patch which includes
(a) the patch was trying to register multiple array entries for the
same subscription which doesn't seem to be required, see changes in
pgstat_vacuum_stat, (b) multiple changes in the test like reduced the
wal_retrieve_retry_interval to 2s which has reduced the test time to
half, remove the check related to resetting of stats as there is no
guarantee that the message will be received by the collector and we
were not sending it again, changed the test case file name to
026_stats as we can add more subscription-related stats in this test
file itself

Since we have pg_stat_subscription view, how about 026_worker_stats.pl?
Sounds better. Updated patch attached.
The rest looks good to me.
Okay, I'll push this patch tomorrow unless there are more comments.
--
With Regards,
Amit Kapila.
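As a usage note, the reset function added by this patch can be invoked as follows (a sketch based on the signatures in the patch; the OIDs shown are placeholders):

```sql
-- Reset the error stats of the main apply worker for subscription 16394.
SELECT pg_stat_reset_subscription_worker(16394, NULL);

-- Reset the stats of the table sync worker copying relation 16401.
SELECT pg_stat_reset_subscription_worker(16394, 16401);

-- Reset the stats of all workers running on the subscription.
SELECT pg_stat_reset_subscription_worker(16394);
```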
Attachments:
v28-0001-Add-a-view-to-show-the-stats-of-subscription-wor.patch (application/octet-stream)
From 6ff17aeae67a7bf26d46f2a27544ab94cacd08d4 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 16 Jul 2021 23:10:22 +0900
Subject: [PATCH v28] Add a view to show the stats of subscription workers.
This commit adds a new system view pg_stat_subscription_workers, which
shows information about any errors that occur during the application of
logical replication changes as well as during performing initial table
synchronization. The subscription statistics entries are removed when the
corresponding subscription is removed.
It also adds an SQL function pg_stat_reset_subscription_worker() to reset
single subscription errors.
The contents of this view can be used by an upcoming patch that skips the
particular transaction that conflicts with the existing data on the
subscriber.
This view can be extended in the future to track other xact related
statistics for subscription workers.
Author: Masahiko Sawada
Reviewed-by: Greg Nancarrow, Hou Zhijie, Tang Haiying, Vignesh C, Dilip Kumar, Takamichi Osumi, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
doc/src/sgml/monitoring.sgml | 157 +++++++++++
src/backend/catalog/system_functions.sql | 4 +
src/backend/catalog/system_views.sql | 23 ++
src/backend/commands/subscriptioncmds.c | 16 +-
src/backend/postmaster/pgstat.c | 409 +++++++++++++++++++++++++++-
src/backend/replication/logical/worker.c | 54 +++-
src/backend/utils/adt/pgstatfuncs.c | 128 ++++++++-
src/include/catalog/pg_proc.dat | 18 ++
src/include/pgstat.h | 109 +++++++-
src/test/regress/expected/rules.out | 18 ++
src/test/subscription/t/026_worker_stats.pl | 154 +++++++++++
src/tools/pgindent/typedefs.list | 4 +
12 files changed, 1068 insertions(+), 26 deletions(-)
create mode 100644 src/test/subscription/t/026_worker_stats.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index af69148..62f2a33 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -627,6 +627,15 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
+ <row>
+ <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+ <entry>One row per subscription worker, showing statistics about errors
+ that occurred on that subscription worker.
+ See <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> for details.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3054,6 +3063,128 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</sect2>
+ <sect2 id="monitoring-pg-stat-subscription-workers">
+ <title><structname>pg_stat_subscription_workers</structname></title>
+
+ <indexterm>
+ <primary>pg_stat_subscription_workers</primary>
+ </indexterm>
+
+ <para>
+ The <structname>pg_stat_subscription_workers</structname> view will contain
+ one row per subscription worker on which errors have occurred, for workers
+ applying logical replication changes and workers handling the initial data
+ copy of the subscribed tables. The statistics entry is removed when the
+ corresponding subscription is dropped.
+ </para>
+
+ <table id="pg-stat-subscription-workers" xreflabel="pg_stat_subscription_workers">
+ <title><structname>pg_stat_subscription_workers</structname> View</title>
+ <tgroup cols="1">
+ <thead>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ Column Type
+ </para>
+ <para>
+ Description
+ </para></entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subname</structfield> <type>name</type>
+ </para>
+ <para>
+ Name of the subscription
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subrelid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker is synchronizing; null for the
+ main apply worker
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_relid</structfield> <type>oid</type>
+ </para>
+ <para>
+ OID of the relation that the worker was processing when the
+ error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_command</structfield> <type>text</type>
+ </para>
+ <para>
+ Name of command being applied when the error occurred. This field
+ is null if the error was reported during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_xid</structfield> <type>xid</type>
+ </para>
+ <para>
+ Transaction ID of the publisher node being applied when the error
+ occurred. This field is null if the error was reported
+ during the initial data copy.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_count</structfield> <type>uint8</type>
+ </para>
+ <para>
+ Number of consecutive times the error occurred
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_message</structfield> <type>text</type>
+ </para>
+ <para>
+ The error message
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Last time at which this error occurred
+ </para></entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+
+ </sect2>
+
<sect2 id="monitoring-pg-stat-ssl-view">
<title><structname>pg_stat_ssl</structname></title>
@@ -5176,6 +5307,32 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
can be granted EXECUTE to run the function.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_reset_subscription_worker</primary>
+ </indexterm>
+ <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type> <optional>, <parameter>relid</parameter> <type>oid</type> </optional> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Resets the statistics of subscription workers running on the
+ subscription with <parameter>subid</parameter> shown in the
+ <structname>pg_stat_subscription_workers</structname> view. If the
+ argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+ resets statistics of the subscription worker handling the initial data
+ copy of the relation with <parameter>relid</parameter>. Otherwise,
+ resets the subscription worker statistics of the main apply worker.
+ If the argument <parameter>relid</parameter> is omitted, resets the
+ statistics of all subscription workers running on the subscription
+ with <parameter>subid</parameter>.
+ </para>
+ <para>
+ This function is restricted to superusers by default, but other users
+ can be granted EXECUTE to run the function.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index f678902..3a4fa90 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,10 @@ REVOKE EXECUTE ON FUNCTION pg_stat_reset_single_function_counters(oid) FROM publ
REVOKE EXECUTE ON FUNCTION pg_stat_reset_replication_slot(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_stat_reset_subscription_worker(oid, oid) FROM public;
+
REVOKE EXECUTE ON FUNCTION lo_import(text) FROM public;
REVOKE EXECUTE ON FUNCTION lo_import(text, oid) FROM public;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb56095..61b515c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,3 +1261,26 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
+
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.last_error_relid,
+ w.last_error_command,
+ w.last_error_xid,
+ w.last_error_count,
+ w.last_error_message,
+ w.last_error_time
+ FROM (SELECT
+ oid as subid,
+ NULL as relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT
+ srsubid as subid,
+ srrelid as relid
+ FROM pg_subscription_rel) sr,
+ LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w
+ JOIN pg_subscription s ON (w.subid = s.oid);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index c47ba26..9427e86 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -32,6 +32,7 @@
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/origin.h"
#include "replication/slot.h"
@@ -1204,7 +1205,8 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Since dropping a replication slot is not transactional, the replication
* slot stays dropped even if the transaction rolls back. So we cannot
* run DROP SUBSCRIPTION inside a transaction block if dropping the
- * replication slot.
+ * replication slot. Also, in this case, we report a message for dropping
+ * the subscription to the stats collector.
*
* XXX The command name should really be something like "DROP SUBSCRIPTION
* of a subscription that is associated with a replication slot", but we
@@ -1377,6 +1379,18 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
}
PG_END_TRY();
+ /*
+ * Send a message for dropping this subscription to the stats collector.
+ * We can safely report dropping the subscription statistics here if the
+ * subscription is associated with a replication slot since we cannot run
+ * DROP SUBSCRIPTION inside a transaction block. Subscription statistics
+ * will be removed later by (auto)vacuum either if it's not associated
+ * with a replication slot or if the message for dropping the subscription
+ * gets lost.
+ */
+ if (slotname)
+ pgstat_report_subscription_drop(subid);
+
table_close(rel, NoLock);
}
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8c166e5..7264d2c 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/catalog.h"
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
#include "common/ip.h"
#include "executor/instrument.h"
#include "libpq/libpq.h"
@@ -105,6 +106,7 @@
#define PGSTAT_DB_HASH_SIZE 16
#define PGSTAT_TAB_HASH_SIZE 512
#define PGSTAT_FUNCTION_HASH_SIZE 512
+#define PGSTAT_SUBWORKER_HASH_SIZE 32
#define PGSTAT_REPLSLOT_HASH_SIZE 32
@@ -320,10 +322,14 @@ NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_no
static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
Oid tableoid, bool create);
+static PgStat_StatSubWorkerEntry *pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry,
+ Oid subid, Oid subrelid,
+ bool create);
static void pgstat_write_statsfiles(bool permanent, bool allDbs);
static void pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent);
static HTAB *pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep);
-static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, bool permanent);
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
+ HTAB *subworkerhash, bool permanent);
static void backend_read_statsfile(void);
static bool pgstat_write_statsfile_needed(void);
@@ -335,6 +341,7 @@ static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, Timestamp
static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
static void pgstat_send_funcstats(void);
static void pgstat_send_slru(void);
+static void pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg);
static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
static bool pgstat_should_report_connstat(void);
static void pgstat_report_disconnect(Oid dboid);
@@ -373,6 +380,8 @@ static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
+static void pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len);
+static void pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len);
/* ------------------------------------------------------------
* Public functions called from postmaster follow
@@ -1302,6 +1311,74 @@ pgstat_vacuum_stat(void)
hash_destroy(htab);
}
+
+ /*
+ * Repeat for subscription workers. Similarly, we needn't bother in the
+ * common case where no subscription workers' stats are being collected.
+ */
+ if (dbentry->subworkers != NULL &&
+ hash_get_num_entries(dbentry->subworkers) > 0)
+ {
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_MsgSubscriptionPurge spmsg;
+
+ /*
+ * Read pg_subscription and make a list of OIDs of all existing
+ * subscriptions
+ */
+ htab = pgstat_collect_oids(SubscriptionRelationId, Anum_pg_subscription_oid);
+
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
+
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ bool exists = false;
+ Oid subid = subwentry->key.subid;
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (hash_search(htab, (void *) &subid, HASH_FIND, NULL) != NULL)
+ continue;
+
+ /*
+ * It is possible that we have multiple entries for the
+ * subscription corresponding to apply worker and tablesync
+ * workers. In such cases, we don't need to add the same subid
+ * again.
+ */
+ for (int i = 0; i < spmsg.m_nentries; i++)
+ {
+ if (spmsg.m_subids[i] == subid)
+ {
+ exists = true;
+ break;
+ }
+ }
+
+ if (exists)
+ continue;
+
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subid;
+
+ /*
+ * If the message is full, send it out and reinitialize to empty
+ */
+ if (spmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONPURGE)
+ {
+ pgstat_send_subscription_purge(&spmsg);
+ spmsg.m_nentries = 0;
+ }
+ }
+
+ /* Send the rest of dead subscriptions */
+ if (spmsg.m_nentries > 0)
+ pgstat_send_subscription_purge(&spmsg);
+
+ hash_destroy(htab);
+ }
}
@@ -1474,7 +1551,8 @@ pgstat_reset_shared_counters(const char *target)
* ----------
*/
void
-pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
+pgstat_reset_single_counter(Oid objoid, Oid subobjoid,
+ PgStat_Single_Reset_Type type)
{
PgStat_MsgResetsinglecounter msg;
@@ -1485,6 +1563,7 @@ pgstat_reset_single_counter(Oid objoid, PgStat_Single_Reset_Type type)
msg.m_databaseid = MyDatabaseId;
msg.m_resettype = type;
msg.m_objectid = objoid;
+ msg.m_subobjectid = subobjoid;
pgstat_send(&msg, sizeof(msg));
}
@@ -1870,6 +1949,51 @@ pgstat_report_replslot_drop(const char *slotname)
}
/* ----------
+ * pgstat_report_subworker_error() -
+ *
+ * Tell the collector about the subscription worker error.
+ * ----------
+ */
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command, TransactionId xid,
+ const char *errmsg)
+{
+ PgStat_MsgSubWorkerError msg;
+ int len;
+
+ pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SUBWORKERERROR);
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subid = subid;
+ msg.m_subrelid = subrelid;
+ msg.m_relid = relid;
+ msg.m_command = command;
+ msg.m_xid = xid;
+ msg.m_timestamp = GetCurrentTimestamp();
+ strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
+
+ len = offsetof(PgStat_MsgSubWorkerError, m_message) + strlen(msg.m_message) + 1;
+ pgstat_send(&msg, len);
+}
+
+/* ----------
+ * pgstat_report_subscription_drop() -
+ *
+ * Tell the collector about dropping the subscription.
+ * ----------
+ */
+void
+pgstat_report_subscription_drop(Oid subid)
+{
+ PgStat_MsgSubscriptionPurge msg;
+
+ msg.m_databaseid = MyDatabaseId;
+ msg.m_subids[0] = subid;
+ msg.m_nentries = 1;
+ pgstat_send_subscription_purge(&msg);
+}
+
+/* ----------
* pgstat_ping() -
*
* Send some junk data to the collector to increase traffic.
@@ -2874,6 +2998,35 @@ pgstat_fetch_stat_funcentry(Oid func_id)
return funcentry;
}
+/*
+ * ---------
+ * pgstat_fetch_stat_subworker_entry() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * the collected statistics for subscription worker or NULL.
+ * ---------
+ */
+PgStat_StatSubWorkerEntry *
+pgstat_fetch_stat_subworker_entry(Oid subid, Oid subrelid)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *wentry = NULL;
+
+ /* Load the stats file if needed */
+ backend_read_statsfile();
+
+ /*
+ * Lookup our database, then find the requested subscription worker stats.
+ */
+ dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
+ if (dbentry != NULL && dbentry->subworkers != NULL)
+ {
+ wentry = pgstat_get_subworker_entry(dbentry, subid, subrelid,
+ false);
+ }
+
+ return wentry;
+}
/*
* ---------
@@ -3312,6 +3465,23 @@ pgstat_send_slru(void)
}
}
+/* --------
+ * pgstat_send_subscription_purge() -
+ *
+ * Send a subscription purge message to the collector
+ * --------
+ */
+static void
+pgstat_send_subscription_purge(PgStat_MsgSubscriptionPurge *msg)
+{
+ int len;
+
+ len = offsetof(PgStat_MsgSubscriptionPurge, m_subids[0])
+ + msg->m_nentries * sizeof(Oid);
+
+ pgstat_setheader(&msg->m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ pgstat_send(msg, len);
+}
/* ----------
* PgstatCollectorMain() -
@@ -3568,6 +3738,14 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_disconnect(&msg.msg_disconnect, len);
break;
+ case PGSTAT_MTYPE_SUBSCRIPTIONPURGE:
+ pgstat_recv_subscription_purge(&msg.msg_subscriptionpurge, len);
+ break;
+
+ case PGSTAT_MTYPE_SUBWORKERERROR:
+ pgstat_recv_subworker_error(&msg.msg_subworkererror, len);
+ break;
+
default:
break;
}
@@ -3613,7 +3791,8 @@ PgstatCollectorMain(int argc, char *argv[])
/*
* Subroutine to clear stats in a database entry
*
- * Tables and functions hashes are initialized to empty.
+ * Tables, functions, and subscription-worker hashes are initialized
+ * to empty.
*/
static void
reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
@@ -3666,6 +3845,13 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
PGSTAT_FUNCTION_HASH_SIZE,
&hash_ctl,
HASH_ELEM | HASH_BLOBS);
+
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS);
}
/*
@@ -3690,7 +3876,7 @@ pgstat_get_db_entry(Oid databaseid, bool create)
/*
* If not found, initialize the new one. This creates empty hash tables
- * for tables and functions, too.
+ * for tables, functions, and subscription workers, too.
*/
if (!found)
reset_dbentry_counters(result);
@@ -3748,6 +3934,47 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
return result;
}
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return the subscription worker entry for the given subscription OID and
+ * relation OID. If subrelid is InvalidOid, the entry for the apply worker
+ * is returned; otherwise, the entry for the table sync worker associated
+ * with subrelid is returned. If no entry exists and the create parameter
+ * is true, a new one is initialized; otherwise, NULL is returned.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+ bool create)
+{
+ PgStat_StatSubWorkerEntry *subwentry;
+ PgStat_StatSubWorkerKey key;
+ bool found;
+ HASHACTION action = (create ? HASH_ENTER : HASH_FIND);
+
+ key.subid = subid;
+ key.subrelid = subrelid;
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+ (void *) &key,
+ action, &found);
+
+ if (!create && !found)
+ return NULL;
+
+ /* If not found, initialize the new one */
+ if (!found)
+ {
+ subwentry->last_error_relid = InvalidOid;
+ subwentry->last_error_command = 0;
+ subwentry->last_error_xid = InvalidTransactionId;
+ subwentry->last_error_count = 0;
+ subwentry->last_error_time = 0;
+ subwentry->last_error_message[0] = '\0';
+ }
+
+ return subwentry;
+}
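As an aside for reviewers: the function above follows the usual dynahash find-or-create idiom. A minimal Python sketch of pgstat_get_subworker_entry()'s behavior (illustrative only; the dict-based table and field names are stand-ins for the C structs):

```python
# Find-or-create lookup keyed on (subid, subrelid), mirroring
# pgstat_get_subworker_entry().  Illustrative only.
def get_subworker_entry(subworkers, subid, subrelid, create):
    key = (subid, subrelid)
    entry = subworkers.get(key)
    if entry is None:
        if not create:
            return None  # HASH_FIND with no match
        # HASH_ENTER path: a fresh entry must be initialized explicitly
        entry = {
            "last_error_relid": 0,    # InvalidOid
            "last_error_command": 0,
            "last_error_xid": 0,      # InvalidTransactionId
            "last_error_count": 0,
            "last_error_time": 0,
            "last_error_message": "",
        }
        subworkers[key] = entry
    return entry
```

HASH_FIND corresponds to the create=False path and HASH_ENTER to create=True; a freshly entered dynahash entry is uninitialized beyond its key, which is why the C code zeroes each field explicitly.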
/* ----------
* pgstat_write_statsfiles() -
@@ -3832,8 +4059,8 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
while ((dbentry = (PgStat_StatDBEntry *) hash_seq_search(&hstat)) != NULL)
{
/*
- * Write out the table and function stats for this DB into the
- * appropriate per-DB stat file, if required.
+ * Write out the table, function, and subscription-worker stats for
+ * this DB into the appropriate per-DB stat file, if required.
*/
if (allDbs || pgstat_db_requested(dbentry->databaseid))
{
@@ -3947,8 +4174,10 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
{
HASH_SEQ_STATUS tstat;
HASH_SEQ_STATUS fstat;
+ HASH_SEQ_STATUS sstat;
PgStat_StatTabEntry *tabentry;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpout;
int32 format_id;
Oid dbid = dbentry->databaseid;
@@ -4004,6 +4233,17 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
}
/*
+ * Walk through the database's subscription worker stats table.
+ */
+ hash_seq_init(&sstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&sstat)) != NULL)
+ {
+ fputc('S', fpout);
+ rc = fwrite(subwentry, sizeof(PgStat_StatSubWorkerEntry), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+ }
+
+ /*
* No more output to be done. Close the temp file and replace the old
* pgstat.stat with it. The ferror() check replaces testing for error
* after each individual fputc or fwrite above.
@@ -4061,8 +4301,9 @@ pgstat_write_db_statsfile(PgStat_StatDBEntry *dbentry, bool permanent)
* files after reading; the in-memory status is now authoritative, and the
* files would be out of date in case somebody else reads them.
*
- * If a 'deep' read is requested, table/function stats are read, otherwise
- * the table/function hash tables remain empty.
+ * If a 'deep' read is requested, table/function/subscription-worker stats are
+ * read, otherwise the table/function/subscription-worker hash tables remain
+ * empty.
* ----------
*/
static HTAB *
@@ -4241,6 +4482,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
memcpy(dbentry, &dbbuf, sizeof(PgStat_StatDBEntry));
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* In the collector, disregard the timestamp we read from the
@@ -4252,8 +4494,8 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
dbentry->stats_timestamp = 0;
/*
- * Don't create tables/functions hashtables for uninteresting
- * databases.
+ * Don't create tables/functions/subworkers hashtables for
+ * uninteresting databases.
*/
if (onlydb != InvalidOid)
{
@@ -4278,6 +4520,14 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
&hash_ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+ hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+ hash_ctl.hcxt = pgStatLocalContext;
+ dbentry->subworkers = hash_create("Per-database subscription worker",
+ PGSTAT_SUBWORKER_HASH_SIZE,
+ &hash_ctl,
+ HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
/*
* If requested, read the data from the database-specific
* file. Otherwise we just leave the hashtables empty.
@@ -4286,6 +4536,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
pgstat_read_db_statsfile(dbentry->databaseid,
dbentry->tables,
dbentry->functions,
+ dbentry->subworkers,
permanent);
break;
@@ -4363,19 +4614,21 @@ done:
* As in pgstat_read_statsfiles, if the permanent file is requested, it is
* removed after reading.
*
- * Note: this code has the ability to skip storing per-table or per-function
- * data, if NULL is passed for the corresponding hashtable. That's not used
- * at the moment though.
+ * Note: this code has the ability to skip storing per-table, per-function, or
+ * per-subscription-worker data, if NULL is passed for the corresponding hashtable.
+ * That's not used at the moment though.
* ----------
*/
static void
pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
- bool permanent)
+ HTAB *subworkerhash, bool permanent)
{
PgStat_StatTabEntry *tabentry;
PgStat_StatTabEntry tabbuf;
PgStat_StatFuncEntry funcbuf;
PgStat_StatFuncEntry *funcentry;
+ PgStat_StatSubWorkerEntry subwbuf;
+ PgStat_StatSubWorkerEntry *subwentry;
FILE *fpin;
int32 format_id;
bool found;
@@ -4490,6 +4743,41 @@ pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
break;
/*
+ * 'S' A PgStat_StatSubWorkerEntry struct describing
+ * subscription worker statistics.
+ */
+ case 'S':
+ if (fread(&subwbuf, 1, sizeof(PgStat_StatSubWorkerEntry),
+ fpin) != sizeof(PgStat_StatSubWorkerEntry))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ /*
+ * Skip if subscription worker data not wanted.
+ */
+ if (subworkerhash == NULL)
+ break;
+
+ subwentry = (PgStat_StatSubWorkerEntry *) hash_search(subworkerhash,
+ (void *) &subwbuf.key,
+ HASH_ENTER, &found);
+
+ if (found)
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"",
+ statfile)));
+ goto done;
+ }
+
+ memcpy(subwentry, &subwbuf, sizeof(subwbuf));
+ break;
+
+ /*
* 'E' The EOF marker of a complete stats file.
*/
case 'E':
@@ -5162,6 +5450,8 @@ pgstat_recv_dropdb(PgStat_MsgDropdb *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
if (hash_search(pgStatDBHash,
(void *) &dbid,
@@ -5199,13 +5489,16 @@ pgstat_recv_resetcounter(PgStat_MsgResetcounter *msg, int len)
hash_destroy(dbentry->tables);
if (dbentry->functions != NULL)
hash_destroy(dbentry->functions);
+ if (dbentry->subworkers != NULL)
+ hash_destroy(dbentry->subworkers);
dbentry->tables = NULL;
dbentry->functions = NULL;
+ dbentry->subworkers = NULL;
/*
* Reset database-level stats, too. This creates empty hash tables for
- * tables and functions.
+ * tables, functions, and subscription workers.
*/
reset_dbentry_counters(dbentry);
}
@@ -5274,6 +5567,14 @@ pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len)
else if (msg->m_resettype == RESET_FUNCTION)
(void) hash_search(dbentry->functions, (void *) &(msg->m_objectid),
HASH_REMOVE, NULL);
+ else if (msg->m_resettype == RESET_SUBWORKER)
+ {
+ PgStat_StatSubWorkerKey key;
+
+ key.subid = msg->m_objectid;
+ key.subrelid = msg->m_subobjectid;
+ (void) hash_search(dbentry->subworkers, (void *) &key, HASH_REMOVE, NULL);
+ }
}
/* ----------
@@ -5817,6 +6118,84 @@ pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len)
}
/* ----------
+ * pgstat_recv_subscription_purge() -
+ *
+ * Process a SUBSCRIPTIONPURGE message.
+ * ----------
+ */
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ HASH_SEQ_STATUS hstat;
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+
+ /* No need to purge if we don't even know the database */
+ if (!dbentry || !dbentry->subworkers)
+ return;
+
+ /* Remove all subscription worker statistics for the given subscriptions */
+ hash_seq_init(&hstat, dbentry->subworkers);
+ while ((subwentry = (PgStat_StatSubWorkerEntry *) hash_seq_search(&hstat)) != NULL)
+ {
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ if (subwentry->key.subid == msg->m_subids[i])
+ {
+ (void) hash_search(dbentry->subworkers, (void *) &(subwentry->key),
+ HASH_REMOVE, NULL);
+ break;
+ }
+ }
+ }
+}
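The purge walks every worker entry because both the apply worker (subrelid = InvalidOid) and any table sync workers carry the subscription OID in their key. A Python sketch of the removal logic in pgstat_recv_subscription_purge() (illustrative only):

```python
# Drop every worker entry, apply worker and table sync workers alike,
# whose key carries one of the purged subscription OIDs.
def purge_subscriptions(subworkers, subids):
    for key in list(subworkers):  # copy the keys: we delete while iterating
        subid, _subrelid = key
        if subid in subids:
            del subworkers[key]
```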
+
+/* ----------
+ * pgstat_recv_subworker_error() -
+ *
+ * Process a SUBWORKERERROR message.
+ * ----------
+ */
+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+ PgStat_StatDBEntry *dbentry;
+ PgStat_StatSubWorkerEntry *subwentry;
+
+ dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+ /* Get the subscription worker stats */
+ subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+ msg->m_subrelid, true);
+ Assert(subwentry);
+
+ if (subwentry->last_error_relid == msg->m_relid &&
+ subwentry->last_error_command == msg->m_command &&
+ subwentry->last_error_xid == msg->m_xid &&
+ strcmp(subwentry->last_error_message, msg->m_message) == 0)
+ {
+ /*
+ * The same error occurred again in succession; just update its
+ * timestamp and count.
+ */
+ subwentry->last_error_count++;
+ subwentry->last_error_time = msg->m_timestamp;
+ return;
+ }
+
+ /* Otherwise, update the error information */
+ subwentry->last_error_relid = msg->m_relid;
+ subwentry->last_error_command = msg->m_command;
+ subwentry->last_error_xid = msg->m_xid;
+ subwentry->last_error_count = 1;
+ subwentry->last_error_time = msg->m_timestamp;
+ strlcpy(subwentry->last_error_message, msg->m_message,
+ PGSTAT_SUBWORKERERROR_MSGLEN);
+}
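The dedup above keeps the stored information compact when a worker hits the same error in a retry loop. A Python sketch of pgstat_recv_subworker_error()'s behavior (illustrative only; the dict fields stand in for PgStat_StatSubWorkerEntry):

```python
# If the incoming error matches the stored one, only the count and
# timestamp change; otherwise the stored error information is replaced.
def record_error(entry, relid, command, xid, message, now):
    # The sender already truncates the message into a fixed-size buffer
    # (PGSTAT_SUBWORKERERROR_MSGLEN, 256 bytes); mirror that with a slice.
    message = message[:255]
    if (entry["last_error_relid"] == relid
            and entry["last_error_command"] == command
            and entry["last_error_xid"] == xid
            and entry["last_error_message"] == message):
        # Same error in succession: bump the counter, refresh the timestamp.
        entry["last_error_count"] += 1
        entry["last_error_time"] = now
        return
    # A different error: overwrite the stored information.
    entry["last_error_relid"] = relid
    entry["last_error_command"] = command
    entry["last_error_xid"] = xid
    entry["last_error_count"] = 1
    entry["last_error_time"] = now
    entry["last_error_message"] = message
```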
+
+/* ----------
* pgstat_write_statsfile_needed() -
*
* Do we need to write out any stats files?
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ae1b391..2e79302 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3332,6 +3332,7 @@ void
ApplyWorkerMain(Datum main_arg)
{
int worker_slot = DatumGetInt32(main_arg);
+ MemoryContext cctx = CurrentMemoryContext;
MemoryContext oldctx;
char originname[NAMEDATALEN];
XLogRecPtr origin_startpos;
@@ -3432,8 +3433,30 @@ ApplyWorkerMain(Datum main_arg)
{
char *syncslotname;
- /* This is table synchronization worker, call initial sync. */
- syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ PG_TRY();
+ {
+ /* This is table synchronization worker, call initial sync. */
+ syncslotname = LogicalRepSyncTableStart(&origin_startpos);
+ }
+ PG_CATCH();
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ /*
+ * Report the table sync error. The initial table synchronization has no
+ * corresponding LogicalRepMsgType, so 0 is passed as the message type.
+ */
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ MyLogicalRepWorker->relid,
+ 0, /* message type */
+ InvalidTransactionId,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
/* allocate slot name in long-lived context */
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
@@ -3551,7 +3574,32 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- LogicalRepApplyLoop(origin_startpos);
+ PG_TRY();
+ {
+ LogicalRepApplyLoop(origin_startpos);
+ }
+ PG_CATCH();
+ {
+ /* report the apply error */
+ if (apply_error_callback_arg.command != 0)
+ {
+ MemoryContext ecxt = MemoryContextSwitchTo(cctx);
+ ErrorData *errdata = CopyErrorData();
+
+ pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ apply_error_callback_arg.rel != NULL
+ ? apply_error_callback_arg.rel->localreloid
+ : InvalidOid,
+ apply_error_callback_arg.command,
+ apply_error_callback_arg.remote_xid,
+ errdata->message);
+ MemoryContextSwitchTo(ecxt);
+ }
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e64857e..f529c15 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2172,7 +2172,7 @@ pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS)
{
Oid taboid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(taboid, RESET_TABLE);
+ pgstat_reset_single_counter(taboid, InvalidOid, RESET_TABLE);
PG_RETURN_VOID();
}
@@ -2182,11 +2182,38 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
{
Oid funcoid = PG_GETARG_OID(0);
- pgstat_reset_single_counter(funcoid, RESET_FUNCTION);
+ pgstat_reset_single_counter(funcoid, InvalidOid, RESET_FUNCTION);
PG_RETURN_VOID();
}
+Datum
+pg_stat_reset_subscription_worker_subrel(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+ Oid relid = PG_ARGISNULL(1) ? InvalidOid : PG_GETARG_OID(1);
+
+ pgstat_reset_single_counter(subid, relid, RESET_SUBWORKER);
+
+ PG_RETURN_VOID();
+}
+
+/* Reset all subscription worker stats associated with the given subscription */
+Datum
+pg_stat_reset_subscription_worker_sub(PG_FUNCTION_ARGS)
+{
+ Oid subid = PG_GETARG_OID(0);
+
+ /*
+ * Use the subscription drop message to remove the statistics of all
+ * workers for the subscription.
+ */
+ pgstat_report_subscription_drop(subid);
+
+ PG_RETURN_VOID();
+}
+
+
/* Reset SLRU counters (a specific one or all of them). */
Datum
pg_stat_reset_slru(PG_FUNCTION_ARGS)
@@ -2380,3 +2407,100 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * Get the subscription worker statistics for the given subscription
+ * (and relation).
+ */
+Datum
+pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_SUBSCRIPTION_WORKER_COLS 8
+ Oid subid = PG_GETARG_OID(0);
+ Oid subrelid;
+ TupleDesc tupdesc;
+ Datum values[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ bool nulls[PG_STAT_GET_SUBSCRIPTION_WORKER_COLS];
+ PgStat_StatSubWorkerEntry *wentry;
+ int i;
+
+ if (PG_ARGISNULL(1))
+ subrelid = InvalidOid;
+ else
+ subrelid = PG_GETARG_OID(1);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_stat_subworker_entry(subid, subrelid);
+
+ /* Return NULL if there is no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise attribute information in the tuple descriptor */
+ tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_SUBSCRIPTION_WORKER_COLS);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "subid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subrelid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "last_error_relid",
+ OIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "last_error_command",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "last_error_xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_error_count",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "last_error_message",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
+ TIMESTAMPTZOID, -1, 0);
+ BlessTupleDesc(tupdesc);
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
+
+ /* subrelid */
+ if (OidIsValid(subrelid))
+ values[i++] = ObjectIdGetDatum(subrelid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_relid */
+ if (OidIsValid(wentry->last_error_relid))
+ values[i++] = ObjectIdGetDatum(wentry->last_error_relid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_command */
+ if (wentry->last_error_command != 0)
+ values[i++] =
+ CStringGetTextDatum(logicalrep_message_type(wentry->last_error_command));
+ else
+ nulls[i++] = true;
+
+ /* last_error_xid */
+ if (TransactionIdIsValid(wentry->last_error_xid))
+ values[i++] = TransactionIdGetDatum(wentry->last_error_xid);
+ else
+ nulls[i++] = true;
+
+ /* last_error_count */
+ values[i++] = Int64GetDatum(wentry->last_error_count);
+
+ /* last_error_message */
+ values[i++] = CStringGetTextDatum(wentry->last_error_message);
+
+ /* last_error_time */
+ if (wentry->last_error_time != 0)
+ values[i++] = TimestampTzGetDatum(wentry->last_error_time);
+ else
+ nulls[i++] = true;
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index e934361..79d787c 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5389,6 +5389,14 @@
proargmodes => '{i,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{slot_name,slot_name,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes,total_txns,total_bytes,stats_reset}',
prosrc => 'pg_stat_get_replication_slot' },
+{ oid => '8523', descr => 'statistics: information about subscription worker',
+ proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
+ proretset => 't', provolatile => 's', proparallel => 'r',
+ prorettype => 'record', proargtypes => 'oid oid',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz}',
+ proargmodes => '{i,i,o,o,o,o,o,o,o,o}',
+ proargnames => '{subid,subrelid,subid,subrelid,last_error_relid,last_error_command,last_error_xid,last_error_count,last_error_message,last_error_time}',
+ prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
@@ -5776,6 +5784,16 @@
proname => 'pg_stat_reset_replication_slot', proisstrict => 'f',
provolatile => 'v', prorettype => 'void', proargtypes => 'text',
prosrc => 'pg_stat_reset_replication_slot' },
+{ oid => '8524',
+ descr => 'statistics: reset collected statistics for a single subscription worker',
+ proname => 'pg_stat_reset_subscription_worker', proisstrict => 'f',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid oid',
+ prosrc => 'pg_stat_reset_subscription_worker_subrel' },
+{ oid => '8525',
+ descr => 'statistics: reset all collected statistics for a single subscription',
+ proname => 'pg_stat_reset_subscription_worker',
+ provolatile => 'v', prorettype => 'void', proargtypes => 'oid',
+ prosrc => 'pg_stat_reset_subscription_worker_sub' },
{ oid => '3163', descr => 'current trigger depth',
proname => 'pg_trigger_depth', provolatile => 's', proparallel => 'r',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index bcd3588..5b51b58 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -14,6 +14,7 @@
#include "datatype/timestamp.h"
#include "portability/instr_time.h"
#include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */
+#include "replication/logicalproto.h"
#include "utils/backend_progress.h" /* for backward compatibility */
#include "utils/backend_status.h" /* for backward compatibility */
#include "utils/hsearch.h"
@@ -83,6 +84,8 @@ typedef enum StatMsgType
PGSTAT_MTYPE_REPLSLOT,
PGSTAT_MTYPE_CONNECT,
PGSTAT_MTYPE_DISCONNECT,
+ PGSTAT_MTYPE_SUBSCRIPTIONPURGE,
+ PGSTAT_MTYPE_SUBWORKERERROR,
} StatMsgType;
/* ----------
@@ -145,7 +148,8 @@ typedef enum PgStat_Shared_Reset_Target
typedef enum PgStat_Single_Reset_Type
{
RESET_TABLE,
- RESET_FUNCTION
+ RESET_FUNCTION,
+ RESET_SUBWORKER
} PgStat_Single_Reset_Type;
/* ------------------------------------------------------------
@@ -364,6 +368,7 @@ typedef struct PgStat_MsgResetsinglecounter
Oid m_databaseid;
PgStat_Single_Reset_Type m_resettype;
Oid m_objectid;
+ Oid m_subobjectid;
} PgStat_MsgResetsinglecounter;
/* ----------
@@ -536,6 +541,54 @@ typedef struct PgStat_MsgReplSlot
PgStat_Counter m_total_bytes;
} PgStat_MsgReplSlot;
+/* ----------
+ * PgStat_MsgSubscriptionPurge Sent by the backend and autovacuum to tell the
+ * collector about dead subscriptions.
+ * ----------
+ */
+#define PGSTAT_NUM_SUBSCRIPTIONPURGE \
+ ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(int)) / sizeof(Oid))
+
+typedef struct PgStat_MsgSubscriptionPurge
+{
+ PgStat_MsgHdr m_hdr;
+ Oid m_databaseid;
+ int m_nentries;
+ Oid m_subids[PGSTAT_NUM_SUBSCRIPTIONPURGE];
+} PgStat_MsgSubscriptionPurge;
+
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync
+ * worker to report an error that occurred while
+ * processing changes.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid identify the subscription and the reporter of
+ * the error: m_subrelid is InvalidOid if the error is reported by an
+ * apply worker, and a valid OID if reported by a table sync worker.
+ */
+ Oid m_databaseid;
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /*
+ * Oid of the table that the reporter was actually processing. m_relid can
+ * be InvalidOid if the error occurred while the worker was applying a
+ * non-data-modification message such as RELATION.
+ */
+ Oid m_relid;
+
+ LogicalRepMsgType m_command;
+ TransactionId m_xid;
+ TimestampTz m_timestamp;
+ char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_MsgSubWorkerError;
/* ----------
* PgStat_MsgRecoveryConflict Sent by the backend upon recovery conflict
@@ -714,6 +767,8 @@ typedef union PgStat_Msg
PgStat_MsgReplSlot msg_replslot;
PgStat_MsgConnect msg_connect;
PgStat_MsgDisconnect msg_disconnect;
+ PgStat_MsgSubscriptionPurge msg_subscriptionpurge;
+ PgStat_MsgSubWorkerError msg_subworkererror;
} PgStat_Msg;
@@ -725,7 +780,7 @@ typedef union PgStat_Msg
* ------------------------------------------------------------
*/
-#define PGSTAT_FILE_FORMAT_ID 0x01A5BCA4
+#define PGSTAT_FILE_FORMAT_ID 0x01A5BCA5
/* ----------
* PgStat_StatDBEntry The collector's data per database
@@ -768,11 +823,16 @@ typedef struct PgStat_StatDBEntry
TimestampTz stats_timestamp; /* time of db stats file update */
/*
- * tables and functions must be last in the struct, because we don't write
- * the pointers out to the stats file.
+ * tables, functions, and subscription workers must be last in the struct,
+ * because we don't write the pointers out to the stats file.
+ *
+ * subworkers is a hash table of PgStat_StatSubWorkerEntry entries, which
+ * store the statistics of logical replication workers: apply workers and
+ * table sync workers.
*/
HTAB *tables;
HTAB *functions;
+ HTAB *subworkers;
} PgStat_StatDBEntry;
@@ -929,6 +989,38 @@ typedef struct PgStat_StatReplSlotEntry
TimestampTz stat_reset_timestamp;
} PgStat_StatReplSlotEntry;
+/* The lookup key for subscription worker hash table */
+typedef struct PgStat_StatSubWorkerKey
+{
+ Oid subid;
+
+ /*
+ * Oid of the table for which the table sync worker copies the initial
+ * data; InvalidOid for apply workers.
+ */
+ Oid subrelid;
+} PgStat_StatSubWorkerKey;
+
+/*
+ * Logical replication apply worker and table sync worker statistics kept in the
+ * stats collector.
+ */
+typedef struct PgStat_StatSubWorkerEntry
+{
+ PgStat_StatSubWorkerKey key; /* hash key (must be first) */
+
+ /*
+ * Subscription worker error statistics representing an error that
+ * occurred during application of changes or the initial table
+ * synchronization.
+ */
+ Oid last_error_relid;
+ LogicalRepMsgType last_error_command;
+ TransactionId last_error_xid;
+ PgStat_Counter last_error_count;
+ TimestampTz last_error_time;
+ char last_error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
+} PgStat_StatSubWorkerEntry;
/*
* Working state needed to accumulate per-function-call timing statistics.
@@ -1019,7 +1111,8 @@ extern void pgstat_drop_database(Oid databaseid);
extern void pgstat_clear_snapshot(void);
extern void pgstat_reset_counters(void);
extern void pgstat_reset_shared_counters(const char *);
-extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type type);
+extern void pgstat_reset_single_counter(Oid objectid, Oid subobjectid,
+ PgStat_Single_Reset_Type type);
extern void pgstat_reset_slru_counter(const char *);
extern void pgstat_reset_replslot_counter(const char *name);
@@ -1038,6 +1131,10 @@ extern void pgstat_report_checksum_failure(void);
extern void pgstat_report_replslot(const PgStat_StatReplSlotEntry *repSlotStat);
extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
+extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+ LogicalRepMsgType command,
+ TransactionId xid, const char *errmsg);
+extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
@@ -1129,6 +1226,8 @@ extern void pgstat_send_wal(bool force);
extern PgStat_StatDBEntry *pgstat_fetch_stat_dbentry(Oid dbid);
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
+extern PgStat_StatSubWorkerEntry *pgstat_fetch_stat_subworker_entry(Oid subid,
+ Oid subrelid);
extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2fa00a3..b58b062 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2094,6 +2094,24 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_time
FROM (pg_subscription su
LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+pg_stat_subscription_workers| SELECT w.subid,
+ s.subname,
+ w.subrelid,
+ w.last_error_relid,
+ w.last_error_command,
+ w.last_error_xid,
+ w.last_error_count,
+ w.last_error_message,
+ w.last_error_time
+ FROM ( SELECT pg_subscription.oid AS subid,
+ NULL::oid AS relid
+ FROM pg_subscription
+ UNION ALL
+ SELECT pg_subscription_rel.srsubid AS subid,
+ pg_subscription_rel.srrelid AS relid
+ FROM pg_subscription_rel) sr,
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, last_error_relid, last_error_command, last_error_xid, last_error_count, last_error_message, last_error_time)
+ JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
pg_stat_all_indexes.schemaname,
diff --git a/src/test/subscription/t/026_worker_stats.pl b/src/test/subscription/t/026_worker_stats.pl
new file mode 100644
index 0000000..e64e0a7
--- /dev/null
+++ b/src/test/subscription/t/026_worker_stats.pl
@@ -0,0 +1,154 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for subscription error stats.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 5;
+
+# Test that the error reported in the pg_stat_subscription_workers view is as expected.
+sub test_subscription_error
+{
+ my ($node, $relname, $xid, $expected_error, $msg) = @_;
+
+ my $check_sql = qq[
+SELECT count(1) > 0 FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass];
+ $check_sql .= " AND last_error_xid = '$xid'::xid;" if $xid ne '';
+
+ # Wait for the error statistics to be updated.
+ $node->poll_query_until(
+ 'postgres', $check_sql,
+) or die "Timed out while waiting for statistics to be updated";
+
+ my $result = $node->safe_psql(
+ 'postgres',
+ qq[
+SELECT subname, last_error_command, last_error_relid::regclass, last_error_count > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass;
+]);
+ is($result, $expected_error, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to flood the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab1 (a int);
+CREATE TABLE test_tab2 (a int);
+INSERT INTO test_tab1 VALUES (1);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab1 (a int primary key);
+CREATE TABLE test_tab2 (a int primary key);
+INSERT INTO test_tab2 VALUES (1);
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;");
+
+# There shouldn't be any subscription errors before starting logical replication.
+my $result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, qq(0), 'check no subscription error');
+
+# Create subscription. The table sync for test_tab2 on tap_sub will enter into
+# infinite error loop due to violating the unique constraint.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (streaming = off);");
+
+$node_publisher->wait_for_catchup('tap_sub');
+
+# Wait for initial table sync for test_tab1 to finish.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) = 1 FROM pg_subscription_rel
+WHERE srrelid = 'test_tab1'::regclass AND srsubstate in ('r', 's')
+]) or die "Timed out while waiting for subscriber to synchronize data";
+
+# Check the initial data.
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(a) FROM test_tab1");
+is($result, q(1), 'check initial data are copied to subscriber');
+
+# Insert more data to test_tab1, raising an error on the subscriber due to
+# violation of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab1 VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_subscription_error($node_subscriber, 'test_tab1', $xid,
+ qq(tap_sub|INSERT|test_tab1|t),
+ 'check the error reported by the apply worker');
+
+# Check the table sync worker's error in the view.
+test_subscription_error($node_subscriber, 'test_tab2', '',
+ qq(tap_sub||test_tab2|t),
+ 'check the error reported by the table sync worker');
+
+# Test for resetting subscription worker statistics.
+# Truncate test_tab1 and test_tab2 so that applying changes and table sync can
+# continue, respectively.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "TRUNCATE test_tab1, test_tab2;");
+
+# Wait for the data to be replicated.
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab1");
+$node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT count(1) > 0 FROM test_tab2");
+
+# There shouldn't be any errors in the view after dropping the subscription.
+$node_subscriber->safe_psql(
+ 'postgres',
+ "DROP SUBSCRIPTION tap_sub;");
+$result = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(1) FROM pg_stat_subscription_workers");
+is($result, q(0), 'no error after dropping subscription');
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index da6ac8e..f41ef0d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1943,6 +1943,8 @@ PgStat_MsgResetsharedcounter
PgStat_MsgResetsinglecounter
PgStat_MsgResetslrucounter
PgStat_MsgSLRU
+PgStat_MsgSubscriptionPurge
+PgStat_MsgSubWorkerError
PgStat_MsgTabpurge
PgStat_MsgTabstat
PgStat_MsgTempFile
@@ -1954,6 +1956,8 @@ PgStat_Single_Reset_Type
PgStat_StatDBEntry
PgStat_StatFuncEntry
PgStat_StatReplSlotEntry
+PgStat_StatSubWorkerEntry
+PgStat_StatSubWorkerKey
PgStat_StatTabEntry
PgStat_SubXactStatus
PgStat_TableCounts
--
1.8.3.1
On Mon, Nov 29, 2021 at 9:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 29, 2021 at 7:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Nov 27, 2021 at 7:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Thank you for updating the patch!
I have made a number of changes in the attached patch, which include:
(a) the patch was trying to register multiple array entries for the
same subscription, which doesn't seem to be required (see changes in
pgstat_vacuum_stat); (b) multiple changes in the test, like reducing
wal_retrieve_retry_interval to 2s, which has cut the test time in
half, removing the check related to resetting of stats (as there is no
guarantee that the message will be received by the collector and we
were not sending it again), and changing the test case file name to
026_stats as we can add more subscription-related stats in this test
file itself.

Since we have the pg_stat_subscription view, how about 026_worker_stats.pl?

Sounds better. Updated patch attached.
Thanks for the updated patch, the v28 patch looks good to me.
Regards,
Vignesh
On Mon, Nov 29, 2021 at 11:38 AM vignesh C <vignesh21@gmail.com> wrote:
I have pushed this patch and there is a buildfarm failure for it. See:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2021-11-30%2005%3A05%3A25
Sawada-San has shared his initial analysis on pgsql-committers [1] and
I am responding here as the fix requires some more discussion.
Looking at the result the test actually got, we had two error entries
for test_tab1 instead of one:

# Failed test 'check the error reported by the apply worker'
# at t/026_worker_stats.pl line 33.
# got: 'tap_sub|INSERT|test_tab1|t
# tap_sub||test_tab1|t'
# expected: 'tap_sub|INSERT|test_tab1|t'

The possible scenario is: the table sync worker for test_tab1 failed
due to an error unrelated to applying changes:

2021-11-30 06:24:02.137 CET [18990:2] ERROR: replication origin with
OID 2 is already active for PID 23706

At this time, the view had one error entry for the table sync worker.
After retrying table sync, it succeeded:

2021-11-30 06:24:04.202 CET [28117:2] LOG: logical replication table
synchronization worker for subscription "tap_sub", table "test_tab1"
has finished

Then, after inserting a row on the publisher, the apply worker tried
to insert the row but failed with a unique key violation, which is
expected:

2021-11-30 06:24:04.307 CET [4806:2] ERROR: duplicate key value
violates unique constraint "test_tab1_pkey"
2021-11-30 06:24:04.307 CET [4806:3] DETAIL: Key (a)=(1) already exists.
2021-11-30 06:24:04.307 CET [4806:4] CONTEXT: processing remote data
during "INSERT" for replication target relation "public.test_tab1" in
transaction 721 at 2021-11-30 06:24:04.305096+01

As a result, we had two error entries for test_tab1: the table sync
worker error and the apply worker error. I didn't expect the table
sync worker for test_tab1 to fail with the "replication origin with
OID 2 is already active for PID 23706" error.

Looking at test_subscription_error() in 026_worker_stats.pl, we have
two checks: in the first check, we wait for the view to show the error
entry for the given relation name and xid. This check passed since we
had the second error (i.e., the apply worker error). In the second
check, we get error entries from pg_stat_subscription_workers by
specifying only the relation name. Therefore, we ended up getting two
entries and the test failed.

To fix this issue, I think that in the second check, we can get the
error from pg_stat_subscription_workers by specifying the relation
name *and* xid, like the first check does. I've attached the patch.
What do you think?
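For illustration, the second check would then use the same qualification as the first. A rough sketch against the pg_stat_subscription_workers view from this patch set (the relation name and xid literals are placeholders):

```sql
-- Second check, now qualified by xid as well, so a stale table sync
-- error for the same relation no longer matches.
SELECT subname, last_error_command, last_error_relid::regclass,
       last_error_count > 0
FROM pg_stat_subscription_workers
WHERE last_error_relid = 'test_tab1'::regclass
  AND last_error_xid = '721'::xid;
```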
I think this will fix the reported failure, but there is another race
condition in the test. Isn't it possible that for table test_tab2 we
get a "replication origin with OID ..." error, or some other error,
before the copy? In that case too, we will proceed from the second
call of test_subscription_error(), which is not what we expect in the
test. Shouldn't we somehow check that the error message also starts
with "duplicate key value violates ..."?
[1]: /messages/by-id/CAD21AoChP5wOT2AYziF+-j7vvThF2NyAs7wr+yy+8hsnu=8Rgg@mail.gmail.com
--
With Regards,
Amit Kapila.
On Tue, Nov 30, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 29, 2021 at 11:38 AM vignesh C <vignesh21@gmail.com> wrote:
I have pushed this patch and there is a buildfarm failure for it. See:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2021-11-30%2005%3A05%3A25

I think this will fix the reported failure but there is another race
condition in the test. Isn't it possible that for table test_tab2, we
get an error "replication origin with OID ..." or some other error
before copy, in that case also, we will proceed from the second call
of test_subscription_error() which is not what we expect in the test?
Right.
Shouldn't we someway check that the error message also starts with
"duplicate key value violates ..."?
Yeah, I think it's a good idea to make the checks more specific. That
is, probably we can specify the prefix of the error message and
subrelid in addition to the current conditions: relid and xid. That
way, we can check what error was reported by which workers (tablesync
or apply) for which relations. And both check queries in
test_subscription_error() can have the same WHERE clause.
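A sketch of such a shared predicate, using the columns from the view in this patch set (the literal values are placeholders the TAP test would interpolate):

```sql
SELECT count(1) > 0
FROM pg_stat_subscription_workers
WHERE last_error_relid = 'test_tab1'::regclass
  AND last_error_xid = '721'::xid
  AND starts_with(last_error_message, 'duplicate key value violates')
  -- subrelid IS NULL for the apply worker; for a table sync worker,
  -- subrelid equals the relation being synchronized.
  AND subrelid IS NULL;
```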
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Nov 30, 2021 at 8:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
And both check queries in test_subscription_error() can have the same
WHERE clause.
I've attached a patch that fixes this issue. Please review it.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
0001-Fix-regression-test-failure-caused-by-commit-8d74fc9.patch
From 30bc31540100974dd1dd63251622cd99fa7d1992 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 30 Nov 2021 20:56:57 +0900
Subject: [PATCH] Fix regression test failure caused by commit 8d74fc96db
The tests were not considering that an error unrelated to applying
changes, e.g. "replication origin with OID 2 is already active ...",
could occur in the table sync worker before it starts to apply changes.
This commit makes the queries used to check error entries also check a
prefix of the error message, as well as the kind of logical replication
worker (tablesync or apply), so we can check for the specific error.
Per buildfarm member sidewinder.
---
src/test/subscription/t/026_worker_stats.pl | 34 ++++++++++++---------
1 file changed, 20 insertions(+), 14 deletions(-)
diff --git a/src/test/subscription/t/026_worker_stats.pl b/src/test/subscription/t/026_worker_stats.pl
index e64e0a74b8..121d789c24 100644
--- a/src/test/subscription/t/026_worker_stats.pl
+++ b/src/test/subscription/t/026_worker_stats.pl
@@ -11,26 +11,30 @@ use Test::More tests => 5;
# Test if the error reported on pg_stat_subscription_workers view is expected.
sub test_subscription_error
{
- my ($node, $relname, $xid, $expected_error, $msg) = @_;
+ my ($node, $relname, $xid, $by_apply_worker, $errmsg_prefix, $expected, $msg) = @_;
- my $check_sql = qq[
-SELECT count(1) > 0 FROM pg_stat_subscription_workers
-WHERE last_error_relid = '$relname'::regclass];
- $check_sql .= " AND last_error_xid = '$xid'::xid;" if $xid ne '';
+ # Construct the part of query used below.
+ my $part_sql = qq[
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND starts_with(last_error_message, '$errmsg_prefix')];
+ $part_sql .= $by_apply_worker
+ ? qq[ AND subrelid IS NULL]
+ : qq[ AND subrelid = '$relname'::regclass];
+ $part_sql .= qq[ AND last_error_xid = '$xid'::xid] if $xid ne '';
# Wait for the error statistics to be updated.
+ my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
$node->poll_query_until(
'postgres', $check_sql,
) or die "Timed out while waiting for statistics to be updated";
- my $result = $node->safe_psql(
- 'postgres',
+ $check_sql =
qq[
-SELECT subname, last_error_command, last_error_relid::regclass, last_error_count > 0
-FROM pg_stat_subscription_workers
-WHERE last_error_relid = '$relname'::regclass;
-]);
- is($result, $expected_error, $msg);
+SELECT subname, last_error_command, last_error_relid::regclass,
+last_error_count > 0 ] . $part_sql;
+ my $result = $node->safe_psql('postgres', $check_sql);
+ is($result, $expected, $msg);
}
# Create publisher node.
@@ -117,12 +121,14 @@ INSERT INTO test_tab1 VALUES (1);
SELECT pg_current_xact_id()::xid;
COMMIT;
]);
-test_subscription_error($node_subscriber, 'test_tab1', $xid,
+test_subscription_error($node_subscriber, 'test_tab1', $xid, 1,
+ qq(duplicate key value violates unique constraint),
qq(tap_sub|INSERT|test_tab1|t),
'check the error reported by the apply worker');
# Check the table sync worker's error in the view.
-test_subscription_error($node_subscriber, 'test_tab2', '',
+test_subscription_error($node_subscriber, 'test_tab2', '', 0,
+ qq(duplicate key value violates unique constraint),
qq(tap_sub||test_tab2|t),
'check the error reported by the table sync worker');
--
2.24.3 (Apple Git-128)
On Tue, Nov 30, 2021 at 7:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Nov 30, 2021 at 8:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached a patch that fixes this issue. Please review it.
Thanks for the updated patch; it applies neatly and make check-world
passes. I also ran the previously failing test in a loop and it passed
every time.
Regards,
Vignesh
On Tues, Nov 30, 2021 9:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached a patch that fixes this issue. Please review it.
I have a question about the test case (I could be wrong here).
Is it possible that the race condition happens between the apply worker
(test_tab1) and the table sync worker (test_tab2)? If so, it seems the
"replication origin with OID" error could happen randomly until we
resolve the conflict. Based on this, for the following code:
-----
# Wait for the error statistics to be updated.
my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
$node->poll_query_until(
'postgres', $check_sql,
) or die "Timed out while waiting for statistics to be updated";
* [1] *
$check_sql =
qq[
SELECT subname, last_error_command, last_error_relid::regclass,
last_error_count > 0 ] . $part_sql;
my $result = $node->safe_psql('postgres', $check_sql);
is($result, $expected, $msg);
-----
Is it possible that the "replication origin with OID" error happens
again at the place [1]? In that case, the error message we have
checked could be replaced by another "replication origin ..." error,
and then the test would fail.
Best regards,
Hou zj
On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
I have a question about the testcase (I could be wrong here).
Is it possible that the race condition happen between apply worker(test_tab1)
and table sync worker(test_tab2) ? If so, it seems the error("replication
origin with OID") could happen randomly until we resolve the conflict.
Based on this, for the following code:
-----
# Wait for the error statistics to be updated.
my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
$node->poll_query_until(
'postgres', $check_sql,
) or die "Timed out while waiting for statistics to be updated";

* [1] *
$check_sql =
qq[
SELECT subname, last_error_command, last_error_relid::regclass,
last_error_count > 0 ] . $part_sql;
my $result = $node->safe_psql('postgres', $check_sql);
is($result, $expected, $msg);
-----

Is it possible that the error("replication origin with OID") happen again at the
place [1]. In this case, the error message we have checked could be replaced by
another error("replication origin ...") and then the test fail ?
Once we get the "duplicate key violation ..." error before * [1] * via
the apply worker, we shouldn't get a replication-origin-specific error,
because the origin setup is done before starting to apply changes.
Also, even if that or some other error happens after * [1] *, the test
should still succeed because of the errmsg_prefix check. Does that make
sense?
--
With Regards,
Amit Kapila.
On Wed, Dec 1, 2021 11:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
Once we get the "duplicate key violation ..." error before * [1] * via
apply_worker then we shouldn't get replication origin-specific error
because the origin set up is done before starting to apply changes.
Also, even if that or some other happens after * [1] * because of
errmsg_prefix check it should still succeed. Does that make sense?
Oh, I missed the point that the origin setup has already been done by
the time we get the expected error. Thanks for the explanation, and I
think the patch looks good.
Best regards,
Hou zj
On Wed, Dec 1, 2021 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

Once we get the "duplicate key violation ..." error before * [1] * via
apply_worker then we shouldn't get replication origin-specific error
because the origin set up is done before starting to apply changes.
Right.
Also, even if that or some other error happens after [1], it should still
succeed because of the errmsg_prefix check.
In this case, the old error ("duplicate key violation ...") is
overwritten by a new error (e.g., a connection error; not sure how
likely it is) and the test fails because the query returns no
entries, no? If so, the result from the second check_sql is unstable
and it's probably better to check the result only once. That is, the
first check_sql includes the command and we exit from the function
once we confirm the error entry is expectedly updated.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Dec 1, 2021 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 1, 2021 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
I have a question about the testcase (I could be wrong here).
Is it possible that the race condition happen between apply worker(test_tab1)
and table sync worker(test_tab2) ? If so, it seems the error("replication
origin with OID") could happen randomly until we resolve the conflict.
Based on this, for the following code:
-----
# Wait for the error statistics to be updated.
my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
$node->poll_query_until(
'postgres', $check_sql,
) or die "Timed out while waiting for statistics to be updated";
[1]
$check_sql =
qq[
SELECT subname, last_error_command, last_error_relid::regclass,
last_error_count > 0 ] . $part_sql;
my $result = $node->safe_psql('postgres', $check_sql);
is($result, $expected, $msg);
-----
Is it possible that the error("replication origin with OID") happen again at the
place [1]. In this case, the error message we have checked could be replaced by
another error("replication origin ...") and then the test fail ?
Once we get the "duplicate key violation ..." error before [1] via
apply_worker then we shouldn't get replication origin-specific error
because the origin set up is done before starting to apply changes.
Right.
Also, even if that or some other happens after [1] because of
errmsg_prefix check it should still succeed.
In this case, the old error ("duplicate key violation ...") is
overwritten by a new error (e.g., connection error. not sure how
possible it is)
Yeah, or probably some memory allocation failure. I think the
probability of such failures is very low but OTOH why take chance.
and the test fails because the query returns no
entries, no?
Right.
If so, the result from the second check_sql is unstable
and it's probably better to check the result only once. That is, the
first check_sql includes the command and we exit from the function
once we confirm the error entry is expectedly updated.
Yeah, I think that should be fine.
With Regards,
Amit Kapila.
On Wed, Dec 1, 2021 at 1:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 1, 2021 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 1, 2021 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
I have a question about the testcase (I could be wrong here).
Is it possible that the race condition happen between apply worker(test_tab1)
and table sync worker(test_tab2) ? If so, it seems the error("replication
origin with OID") could happen randomly until we resolve the conflict.
Based on this, for the following code:
-----
# Wait for the error statistics to be updated.
my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
$node->poll_query_until(
'postgres', $check_sql,
) or die "Timed out while waiting for statistics to be updated";
[1]
$check_sql =
qq[
SELECT subname, last_error_command, last_error_relid::regclass,
last_error_count > 0 ] . $part_sql;
my $result = $node->safe_psql('postgres', $check_sql);
is($result, $expected, $msg);
-----
Is it possible that the error("replication origin with OID") happen again at the
place [1]. In this case, the error message we have checked could be replaced by
another error("replication origin ...") and then the test fail ?
Once we get the "duplicate key violation ..." error before [1] via
apply_worker then we shouldn't get replication origin-specific error
because the origin set up is done before starting to apply changes.
Right.
Also, even if that or some other happens after [1] because of
errmsg_prefix check it should still succeed.
In this case, the old error ("duplicate key violation ...") is
overwritten by a new error (e.g., connection error. not sure how
possible it is)
Yeah, or probably some memory allocation failure. I think the
probability of such failures is very low but OTOH why take chance.
and the test fails because the query returns no
entries, no?
Right.
If so, the result from the second check_sql is unstable
and it's probably better to check the result only once. That is, the
first check_sql includes the command and we exit from the function
once we confirm the error entry is expectedly updated.
Yeah, I think that should be fine.
Okay, I've attached an updated patch. Please review it.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v2-0001-Fix-regression-test-failure-caused-by-commit-8d74.patch
From 9867e1f6da9507303a0e7ca3b493d68719a91a33 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 30 Nov 2021 20:56:57 +0900
Subject: [PATCH v2] Fix regression test failure caused by commit 8d74fc96db
The tests were not considering that an error unrelated to applying
changes, e.g. "replication origin with OID 2 is already active ...",
could occur on the table sync worker before it starts to apply changes.
This commit makes the queries used to check error entries also check
a prefix of the error message as well as the source of the logical
replication worker (tablesync or apply), so we can check for the
specific error.
Per buildfarm member sidewinder.
---
src/test/subscription/t/026_worker_stats.pl | 57 ++++++++++++---------
1 file changed, 33 insertions(+), 24 deletions(-)
diff --git a/src/test/subscription/t/026_worker_stats.pl b/src/test/subscription/t/026_worker_stats.pl
index e64e0a74b8..3510976a91 100644
--- a/src/test/subscription/t/026_worker_stats.pl
+++ b/src/test/subscription/t/026_worker_stats.pl
@@ -6,31 +6,38 @@ use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
-use Test::More tests => 5;
+use Test::More tests => 3;
# Test if the error reported on pg_stat_subscription_workers view is expected.
sub test_subscription_error
{
- my ($node, $relname, $xid, $expected_error, $msg) = @_;
+ my ($node, $relname, $command, $xid, $by_apply_worker, $errmsg_prefix, $msg)
+ = @_;
my $check_sql = qq[
-SELECT count(1) > 0 FROM pg_stat_subscription_workers
-WHERE last_error_relid = '$relname'::regclass];
- $check_sql .= " AND last_error_xid = '$xid'::xid;" if $xid ne '';
-
- # Wait for the error statistics to be updated.
- $node->poll_query_until(
- 'postgres', $check_sql,
-) or die "Timed out while waiting for statistics to be updated";
-
- my $result = $node->safe_psql(
- 'postgres',
- qq[
-SELECT subname, last_error_command, last_error_relid::regclass, last_error_count > 0
+SELECT count(1) > 0
FROM pg_stat_subscription_workers
-WHERE last_error_relid = '$relname'::regclass;
-]);
- is($result, $expected_error, $msg);
+WHERE last_error_relid = '$relname'::regclass
+ AND starts_with(last_error_message, '$errmsg_prefix')];
+
+ # subrelid
+ $check_sql .= $by_apply_worker
+ ? qq[ AND subrelid IS NULL]
+ : qq[ AND subrelid = '$relname'::regclass];
+
+ # last_error_command
+ $check_sql .= $command eq ''
+ ? qq[ AND last_error_command IS NULL]
+ : qq[ AND last_error_command = '$command'];
+
+ # last_error_xid
+ $check_sql .= $xid eq ''
+ ? qq[ AND last_error_xid IS NULL]
+ : qq[ AND last_error_xid = '$xid'::xid];
+
+ # Wait for the particular error statistics to be reported.
+ $node->poll_query_until('postgres', $check_sql,
+) or die "Timed out while waiting for " . $msg;
}
# Create publisher node.
@@ -117,14 +124,16 @@ INSERT INTO test_tab1 VALUES (1);
SELECT pg_current_xact_id()::xid;
COMMIT;
]);
-test_subscription_error($node_subscriber, 'test_tab1', $xid,
- qq(tap_sub|INSERT|test_tab1|t),
- 'check the error reported by the apply worker');
+test_subscription_error($node_subscriber, 'test_tab1', 'INSERT', $xid,
+ 1, # check apply worker error
+ qq(duplicate key value violates unique constraint),
+ 'error reported by the apply worker');
# Check the table sync worker's error in the view.
-test_subscription_error($node_subscriber, 'test_tab2', '',
- qq(tap_sub||test_tab2|t),
- 'check the error reported by the table sync worker');
+test_subscription_error($node_subscriber, 'test_tab2', '', '',
+ 0, # check tablesync worker error
+ qq(duplicate key value violates unique constraint),
+ 'the error reported by the table sync worker');
# Test for resetting subscription worker statistics.
# Truncate test_tab1 and test_tab2 so that applying changes and table sync can
--
2.24.3 (Apple Git-128)
On Wednesday, December 1, 2021 1:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 1, 2021 at 1:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 1, 2021 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
If so, the result from the second check_sql is unstable and it's
probably better to check the result only once. That is, the first
check_sql includes the command and we exit from the function once we
confirm the error entry is expectedly updated.
Yeah, I think that should be fine.
Okay, I've attached an updated patch. Please review it.
I agreed that checking the result only once makes the test more stable.
The patch looks good to me.
Best regards,
Hou zj
On Wed, Dec 1, 2021 at 11:57 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Wednesday, December 1, 2021 1:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Okay, I've attached an updated patch. Please review it.
I agreed that checking the result only once makes the test more stable.
The patch looks good to me.
Pushed.
Now, coming back to the skip_xid patch. To summarize the discussion in
that regard so far, we have discussed various alternatives for the
syntax like:
a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
b. Alter Subscription <sub_name> SET ( subscription_parameter [=value]
[, ... ] );
c. Alter Subscription <sub_name> On Error ( subscription_parameter
[=value] [, ... ] );
d. Alter Subscription <sub_name> SKIP ( subscription_parameter
[=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...
We didn't prefer (a) as it can lead to more keywords as we add more
options; (b) as we want these new skip options to behave and be set
differently than existing subscription properties because of the
difference in their behavior; (c) as that sounds more like an action
to be performed on a future condition (error/conflict) whereas here we
already knew that an error has happened;
As per discussion till now, option (d) seems preferable. In this, we
need to see how and what to allow as options. The simplest way for the
first version is to just allow one xid to be specified at a time which
would mean that specifying multiple xids should error out. We can also
additionally allow specifying operations like 'insert', 'update',
etc., and then relation list (list of oids). What that would mean is
that for a transaction we can allow which particular operations and
relations we want to skip.
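For illustration, option (d) might end up being used like this (hypothetical syntax; the parameter names follow the list above, and the second statement only illustrates the possible extension discussed here, not a concrete proposal):

```sql
-- Skip the whole failed remote transaction (the xid can be taken from
-- the apply worker's errcontext or from pg_stat_subscription_workers):
ALTER SUBSCRIPTION test_sub SKIP (xid = 590);

-- Possible future extension: restrict the skip to particular
-- operations/relations within that transaction.
ALTER SUBSCRIPTION test_sub SKIP (xid = 590, operation = 'insert');
```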
I am not sure what exactly we can provide to users to allow skipping
initial table sync as we can't specify XID there. One option that
comes to mind is to allow specifying a combination of copy_data and
relid to skip table sync for a particular relation. We might think of
not doing anything for table sync workers but not sure if that is a
good option.
Thoughts?
--
With Regards,
Amit Kapila.
On 02.12.21 07:48, Amit Kapila wrote:
a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
b. Alter Subscription <sub_name> SET ( subscription_parameter [=value]
[, ... ] );
c. Alter Subscription <sub_name> On Error ( subscription_parameter
[=value] [, ... ] );
d. Alter Subscription <sub_name> SKIP ( subscription_parameter
[=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...
As per discussion till now, option (d) seems preferable.
I agree.
In this, we
need to see how and what to allow as options. The simplest way for the
first version is to just allow one xid to be specified at a time which
would mean that specifying multiple xids should error out. We can also
additionally allow specifying operations like 'insert', 'update',
etc., and then relation list (list of oids). What that would mean is
that for a transaction we can allow which particular operations and
relations we want to skip.
I don't know how difficult it would be, but allowing multiple xids might
be desirable. But this syntax gives you flexibility, so we can also
start with a simple implementation.
I am not sure what exactly we can provide to users to allow skipping
initial table sync as we can't specify XID there. One option that
comes to mind is to allow specifying a combination of copy_data and
relid to skip table sync for a particular relation. We might think of
not doing anything for table sync workers but not sure if that is a
good option.
I don't think this feature should affect tablesync. The semantics are
not clear, and it's not really needed. If the tablesync doesn't work,
you can try the setup again from scratch.
On Thu, Dec 2, 2021 at 8:38 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 02.12.21 07:48, Amit Kapila wrote:
a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
b. Alter Subscription <sub_name> SET ( subscription_parameter [=value]
[, ... ] );
c. Alter Subscription <sub_name> On Error ( subscription_parameter
[=value] [, ... ] );
d. Alter Subscription <sub_name> SKIP ( subscription_parameter
[=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...
As per discussion till now, option (d) seems preferable.
I agree.
In this, we
need to see how and what to allow as options. The simplest way for the
first version is to just allow one xid to be specified at a time which
would mean that specifying multiple xids should error out. We can also
additionally allow specifying operations like 'insert', 'update',
etc., and then relation list (list of oids). What that would mean is
that for a transaction we can allow which particular operations and
relations we want to skip.
I don't know how difficult it would be, but allowing multiple xids might
be desirable.
Are there many cases where there could be multiple xid failures that
the user can skip? The apply worker always keeps looping on the same
error, so the user wouldn't know of a second xid failure (if any)
till the first failure is resolved. I could think of one such case
during the initial synchronization phase, where the apply worker went
ahead of the tablesync worker by skipping the application of changes
on the corresponding table. After that, it is possible that the table
sync worker fails during the catch-up phase and the apply worker fails
during the processing of some other rel.
But this syntax gives you flexibility, so we can also
start with a simple implementation.
Yeah, I also think so. BTW, what do you think of providing extra
flexibility of giving other options like 'operation', 'rel' along with
xid? I think such options could be useful for large transactions that
operate on multiple tables as it is quite possible that only a
particular operation from the entire transaction is the cause of
failure. Now, on one side, we can argue that skipping the entire
transaction is better from the consistency point of view but I think
it is already possible that we just skip a particular update/delete
(if the corresponding tuple doesn't exist on the subscriber). For the
sake of simplicity, we can just allow providing xid at this stage and
then extend it later as required but I am not very sure of that point.
I am not sure what exactly we can provide to users to allow skipping
initial table sync as we can't specify XID there. One option that
comes to mind is to allow specifying a combination of copy_data and
relid to skip table sync for a particular relation. We might think of
not doing anything for table sync workers but not sure if that is a
good option.
I don't think this feature should affect tablesync. The semantics are
not clear, and it's not really needed. If the tablesync doesn't work,
you can try the setup again from scratch.
Okay, that makes sense. But note it is possible that tablesync workers
might also need to skip some xids during the catchup phase to complete
the sync.
--
With Regards,
Amit Kapila.
On Fri, Dec 3, 2021 at 11:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 2, 2021 at 8:38 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 02.12.21 07:48, Amit Kapila wrote:
a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
b. Alter Subscription <sub_name> SET ( subscription_parameter [=value]
[, ... ] );
c. Alter Subscription <sub_name> On Error ( subscription_parameter
[=value] [, ... ] );
d. Alter Subscription <sub_name> SKIP ( subscription_parameter
[=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...
As per discussion till now, option (d) seems preferable.
I agree.
+1
In this, we
need to see how and what to allow as options. The simplest way for the
first version is to just allow one xid to be specified at a time which
would mean that specifying multiple xids should error out. We can also
additionally allow specifying operations like 'insert', 'update',
etc., and then relation list (list of oids). What that would mean is
that for a transaction we can allow which particular operations and
relations we want to skip.
I don't know how difficult it would be, but allowing multiple xids might
be desirable.
Are there many cases where there could be multiple xid failures that
the user can skip? Apply worker always keeps looping at the same error
failure so the user wouldn't know of the second xid failure (if any)
till the first failure is resolved. I could think of one such case
where it is possible during the initial synchronization phase where
apply worker went ahead then tablesync worker by skipping to apply the
changes on the corresponding table. After that, it is possible, that
the table sync worker failed during the catch-up phase and apply
worker fails during the processing of some other rel.
But this syntax gives you flexibility, so we can also
start with a simple implementation.
Yeah, I also think so. BTW, what do you think of providing extra
flexibility of giving other options like 'operation', 'rel' along with
xid? I think such options could be useful for large transactions that
operate on multiple tables as it is quite possible that only a
particular operation from the entire transaction is the cause of
failure. Now, on one side, we can argue that skipping the entire
transaction is better from the consistency point of view but I think
it is already possible that we just skip a particular update/delete
(if the corresponding tuple doesn't exist on the subscriber). For the
sake of simplicity, we can just allow providing xid at this stage and
then extend it later as required but I am not very sure of that point.
+1
Skipping a whole transaction by specifying xid would be a good start.
Ideally, we'd like to automatically skip only operations within the
transaction that fail but it seems not easy to achieve. If we allow
specifying operations and/or relations, probably multiple operations
or relations need to be specified in some cases. Otherwise, the
subscriber cannot continue logical replication if the transaction has
multiple operations on different relations that fail. But similar to
the idea of specifying multiple xids, we need to note the fact that
the user wouldn't know of the second operation failure unless the apply
worker applies the change. So I'm not sure there are many use cases in
practice where users can specify multiple operations and relations in
order to skip applies that fail.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Dec 3, 2021 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Dec 3, 2021 at 11:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
But this syntax gives you flexibility, so we can also
start with a simple implementation.
Yeah, I also think so. BTW, what do you think of providing extra
flexibility of giving other options like 'operation', 'rel' along with
xid? I think such options could be useful for large transactions that
operate on multiple tables as it is quite possible that only a
particular operation from the entire transaction is the cause of
failure. Now, on one side, we can argue that skipping the entire
transaction is better from the consistency point of view but I think
it is already possible that we just skip a particular update/delete
(if the corresponding tuple doesn't exist on the subscriber). For the
sake of simplicity, we can just allow providing xid at this stage and
then extend it later as required but I am not very sure of that point.
+1
Skipping a whole transaction by specifying xid would be a good start.
Okay, that sounds reasonable, so let's do that for now.
--
With Regards,
Amit Kapila.
On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Dec 3, 2021 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Dec 3, 2021 at 11:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
But this syntax gives you flexibility, so we can also
start with a simple implementation.Yeah, I also think so. BTW, what do you think of providing extra
flexibility of giving other options like 'operation', 'rel' along with
xid? I think such options could be useful for large transactions that
operate on multiple tables as it is quite possible that only a
particular operation from the entire transaction is the cause of
failure. Now, on one side, we can argue that skipping the entire
transaction is better from the consistency point of view but I think
it is already possible that we just skip a particular update/delete
(if the corresponding tuple doesn't exist on the subscriber). For the
sake of simplicity, we can just allow providing xid at this stage and
then extend it later as required but I am not very sure of that point.+1
Skipping a whole transaction by specifying xid would be a good start.
Okay, that sounds reasonable, so let's do that for now.
I'll submit the patch tomorrow.
While updating the patch, I realized that skipping a transaction that
is prepared on the publisher will be a bit tricky:
First of all, since skip-xid is in pg_subscription catalog, we need to
do a catalog update in a transaction and commit it to disable it. I
think we need to set origin-lsn and timestamp of the transaction being
skipped to the transaction that does the catalog update. That is,
during skipping the (not prepared) transaction, we skip all
data-modification changes coming from the publisher, do a catalog
update, and commit the transaction. If we do the catalog update in the
next transaction after skipping the whole transaction, skip_xid could
be left in case of a server crash between them. Also, we cannot set
origin-lsn and timestamp to an empty transaction.
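The flow described above could be sketched roughly as follows (pseudocode only, not the actual apply worker code; it assumes a skip-xid column in pg_subscription and uses the replication-origin session variables for illustration):

```
apply_handle_begin(xid):
    skipping = (xid == MySubscription->skipxid)

apply_handle_insert/update/delete(change):
    if skipping:
        return                    # discard all data-modification changes

apply_handle_commit(origin_lsn, origin_timestamp):
    if skipping:
        StartTransaction()
        clear skip-xid in pg_subscription        # catalog update
        replorigin_session_origin_lsn = origin_lsn
        replorigin_session_origin_timestamp = origin_timestamp
        CommitTransaction()
        # a single transaction: a crash cannot leave skip_xid set
        # after the origin has advanced past the skipped transaction
```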
In prepared transaction cases, I think that when handling a prepare
message, we need to commit the transaction to update the catalog,
instead of preparing it. And at the commit prepared and rollback
prepared time, we skip it since there is not the prepared transaction
on the subscriber. Currently, handling rollback prepared already
behaves so; it first checks whether we have prepared the transaction
or not and skip it if haven’t. So I think we need to do that also for
commit prepared case. With that, this requires protocol changes so
that the subscriber can get prepare-lsn and prepare-time when handling
commit prepared.
So I’m writing a separate patch to add prepare-lsn and timestamp to
commit_prepared message, which will be a building block for skipping
prepared transactions. Actually, I think it’s beneficial even today;
we can skip preparing the transaction if it’s an empty transaction.
Although the comment says it’s not a common case, I think it could
happen quite often in some cases:
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
* worthwhile because such cases shouldn't be common.
*/
For example, if the publisher has multiple subscriptions and there are
many prepared transactions that modify the particular table subscribed
by one publisher, many empty transactions are replicated to other
subscribers.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On 03.12.21 03:53, Amit Kapila wrote:
I don't know how difficult it would be, but allowing multiple xids might
be desirable.
Are there many cases where there could be multiple xid failures that
the user can skip? Apply worker always keeps looping at the same error
failure so the user wouldn't know of the second xid failure (if any)
till the first failure is resolved.
Yeah, nevermind, doesn't make sense.
Yeah, I also think so. BTW, what do you think of providing extra
flexibility of giving other options like 'operation', 'rel' along with
xid? I think such options could be useful for large transactions that
operate on multiple tables as it is quite possible that only a
particular operation from the entire transaction is the cause of
failure. Now, on one side, we can argue that skipping the entire
transaction is better from the consistency point of view but I think
it is already possible that we just skip a particular update/delete
(if the corresponding tuple doesn't exist on the subscriber). For the
sake of simplicity, we can just allow providing xid at this stage and
then extend it later as required but I am not very sure of that point.
Skipping transactions partially sounds dangerous, especially when
exposed as an option to users. Needs more careful thought.
On Tue, Dec 7, 2021 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I'll submit the patch tomorrow.
While updating the patch, I realized that skipping a transaction that
is prepared on the publisher will be tricky a bit;
First of all, since skip-xid is in pg_subscription catalog, we need to
do a catalog update in a transaction and commit it to disable it. I
think we need to set origin-lsn and timestamp of the transaction being
skipped to the transaction that does the catalog update. That is,
during skipping the (not prepared) transaction, we skip all
data-modification changes coming from the publisher, do a catalog
update, and commit the transaction. If we do the catalog update in the
next transaction after skipping the whole transaction, skip_xid could
be left in case of a server crash between them.
But if we haven't updated origin_lsn/timestamp before the crash, won't
it request the same transaction again from the publisher? If so, it
will be again able to skip it because skip_xid is still not updated.
Also, we cannot set
origin-lsn and timestamp to an empty transaction.
But won't we update the catalog for skip_xid in that case?
Do we see any advantage of updating the skip_xid in the same
transaction vs. doing it in a separate transaction? If not then
probably we can choose either of those ways and add some comments to
indicate the possibility of doing it another way.
In prepared transaction cases, I think that when handling a prepare
message, we need to commit the transaction to update the catalog,
instead of preparing it. And at the commit prepared and rollback
prepared time, we skip it since there is not the prepared transaction
on the subscriber.
Can't we think of just allowing prepare in this case and updating the
skip_xid only at commit time? I see that in this case, we would be
doing prepare for a transaction that has no changes but as such cases
won't be common, isn't that acceptable?
Currently, handling rollback prepared already
behaves so; it first checks whether we have prepared the transaction
or not and skip it if haven’t. So I think we need to do that also for
commit prepared case. With that, this requires protocol changes so
that the subscriber can get prepare-lsn and prepare-time when handling
commit prepared.
So I’m writing a separate patch to add prepare-lsn and timestamp to
commit_prepared message, which will be a building block for skipping
prepared transactions. Actually, I think it’s beneficial even today;
we can skip preparing the transaction if it’s an empty transaction.
Although the comment it’s not a common case, I think that it could
happen quite often in some cases:
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
* worthwhile because such cases shouldn't be common.
*/
For example, if the publisher has multiple subscriptions and there are
many prepared transactions that modify the particular table subscribed
by one publisher, many empty transactions are replicated to other
subscribers.
I think this is not clear to me. Why would one have multiple
subscriptions for the same publication? I thought it is possible when
say some publisher doesn't publish any data of prepared transaction
say because the corresponding action is not published or something
like that. I don't deny that someday we want to optimize this case but
it might be better if we don't need to do it along with this patch.
--
With Regards,
Amit Kapila.
On Wed, Dec 8, 2021 at 2:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 7, 2021 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I'll submit the patch tomorrow.
While updating the patch, I realized that skipping a transaction that
is prepared on the publisher will be tricky a bit;
First of all, since skip-xid is in pg_subscription catalog, we need to
do a catalog update in a transaction and commit it to disable it. I
think we need to set origin-lsn and timestamp of the transaction being
skipped to the transaction that does the catalog update. That is,
during skipping the (not prepared) transaction, we skip all
data-modification changes coming from the publisher, do a catalog
update, and commit the transaction. If we do the catalog update in the
next transaction after skipping the whole transaction, skip_xid could
be left in case of a server crash between them.
But if we haven't updated origin_lsn/timestamp before the crash, won't
it request the same transaction again from the publisher? If so, it
will be again able to skip it because skip_xid is still not updated.
Yes. I mean that if we update origin_lsn and origin_timestamp when
committing the skipped transaction and then update the catalog in the
next transaction it doesn't work in case of a crash. But it's not
possible in the first place since the first transaction is empty and
we cannot set origin_lsn and origin_timestamp to it.
Also, we cannot set
origin-lsn and timestamp to an empty transaction.
But won't we update the catalog for skip_xid in that case?
Yes. Probably my explanation was not clear. Even if we skip all
changes of the transaction, the transaction doesn't become empty since
we update the catalog.
Do we see any advantage of updating the skip_xid in the same
transaction vs. doing it in a separate transaction? If not then
probably we can choose either of those ways and add some comments to
indicate the possibility of doing it another way.
I think that since the skipped transaction is always empty there is
always one transaction. What we need to consider is when we update
origin_lsn and origin_timestamp. In non-prepared transaction cases,
the only option is when updating the catalog.
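To make the crash-safety argument above concrete, here is a small Python model (purely illustrative, not PostgreSQL code; the class, field names, and LSN values are hypothetical stand-ins for pg_subscription.subskipxid and the replication origin): because skipping the changes, clearing skip_xid, and advancing the origin all commit atomically, a crash either loses all three (so the publisher resends the transaction and it is skipped again) or none.

```python
# Hypothetical model of the non-prepared skip flow discussed above.
# The catalog update (clearing skip_xid) and the origin advance happen
# in the same transaction as the skip, so a crash leaves no partial state.

class Subscriber:
    def __init__(self, skip_xid):
        self.skip_xid = skip_xid      # models pg_subscription.subskipxid
        self.origin_lsn = 0           # models replication origin progress

    def apply_transaction(self, xid, commit_lsn, crash_before_commit=False):
        """Apply (or skip) one remote transaction atomically."""
        if crash_before_commit:
            return False              # nothing committed: no state changes
        if xid == self.skip_xid:
            # skip all changes, clear skip_xid, advance origin: one commit
            self.skip_xid = None
        self.origin_lsn = commit_lsn
        return True

sub = Subscriber(skip_xid=590)
# Crash while applying: neither skip_xid nor origin_lsn changed, so the
# publisher resends the same transaction and it can be skipped again.
sub.apply_transaction(590, 1000, crash_before_commit=True)
assert sub.skip_xid == 590 and sub.origin_lsn == 0
sub.apply_transaction(590, 1000)
assert sub.skip_xid is None and sub.origin_lsn == 1000
```

This is only a sketch of the invariant being discussed, not of the actual worker code.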
In prepared transaction cases, I think that when handling a prepare
message, we need to commit the transaction to update the catalog,
instead of preparing it. And at the commit prepared and rollback
prepared time, we skip it since there is no prepared transaction
on the subscriber.
Can't we think of just allowing prepare in this case and updating the
skip_xid only at commit time? I see that in this case, we would be
doing prepare for a transaction that has no changes but as such cases
won't be common, isn't that acceptable?
In this case, we will end up committing both the prepared (empty)
transaction and the transaction that updates the catalog, right? If
so, since these are separate transactions it can be a problem in case
of a crash between these two commits.
Currently, handling rollback prepared already
behaves so; it first checks whether we have prepared the transaction
or not and skips it if we haven’t. So I think we need to do that also for
the commit prepared case. With that, this requires protocol changes so
that the subscriber can get prepare-lsn and prepare-time when handling
commit prepared.
So I’m writing a separate patch to add prepare-lsn and timestamp to
commit_prepared message, which will be a building block for skipping
prepared transactions. Actually, I think it’s beneficial even today;
we can skip preparing the transaction if it’s an empty transaction.
Although the comment says it’s not a common case, I think that it could
happen quite often in some cases:
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
* worthwhile because such cases shouldn't be common.
*/
For example, if the publisher has multiple subscriptions and there are
many prepared transactions that modify the particular table subscribed
by one publisher, many empty transactions are replicated to other
subscribers.
I think this is not clear to me. Why would one have multiple
subscriptions for the same publication? I thought it is possible when
say some publisher doesn't publish any data of prepared transaction
say because the corresponding action is not published or something
like that. I don't deny that someday we want to optimize this case but
it might be better if we don't need to do it along with this patch.
I imagined that the publisher has two publications (say pub-A and
pub-B) that publish a different set of relations in the database and
there are two subscribers that are subscribing to either one
publication (e.g., subscriber-A subscribes to pub-A and subscriber-B
subscribes to pub-B). If many prepared transactions happen on the
publisher and these transactions modify only relations published by
pub-A, both subscriber-A and subscriber-B would prepare the same
number of transactions but all of them on subscriber-B are empty.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 8, 2021 at 2:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 7, 2021 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I'll submit the patch tomorrow.
While updating the patch, I realized that skipping a transaction that
is prepared on the publisher will be a bit tricky.
First of all, since skip_xid is in the pg_subscription catalog, we need to
do a catalog update in a transaction and commit it to disable it. I
think we need to set origin-lsn and timestamp of the transaction being
skipped to the transaction that does the catalog update. That is,
during skipping the (not prepared) transaction, we skip all
data-modification changes coming from the publisher, do a catalog
update, and commit the transaction. If we do the catalog update in the
next transaction after skipping the whole transaction, skip_xid could
be left in case of a server crash between them.
But if we haven't updated origin_lsn/timestamp before the crash, won't
it request the same transaction again from the publisher? If so, it
will be again able to skip it because skip_xid is still not updated.
Yes. I mean that if we update origin_lsn and origin_timestamp when
committing the skipped transaction and then update the catalog in the
next transaction it doesn't work in case of a crash. But it's not
possible in the first place since the first transaction is empty and
we cannot set origin_lsn and origin_timestamp to it.
Also, we cannot set
origin-lsn and timestamp to an empty transaction.
But won't we update the catalog for skip_xid in that case?
Yes. Probably my explanation was not clear. Even if we skip all
changes of the transaction, the transaction doesn't become empty since
we update the catalog.
Do we see any advantage of updating the skip_xid in the same
transaction vs. doing it in a separate transaction? If not then
probably we can choose either of those ways and add some comments to
indicate the possibility of doing it another way.
I think that since the skipped transaction is always empty there is
always one transaction. What we need to consider is when we update
origin_lsn and origin_timestamp. In non-prepared transaction cases,
the only option is when updating the catalog.
Your last sentence is not completely clear to me but it seems you
agree that we can use one transaction instead of two to skip the
changes, perform a catalog update, and update origin_lsn/timestamp.
In prepared transaction cases, I think that when handling a prepare
message, we need to commit the transaction to update the catalog,
instead of preparing it. And at the commit prepared and rollback
prepared time, we skip it since there is no prepared transaction
on the subscriber.
Can't we think of just allowing prepare in this case and updating the
skip_xid only at commit time? I see that in this case, we would be
doing prepare for a transaction that has no changes but as such cases
won't be common, isn't that acceptable?
In this case, we will end up committing both the prepared (empty)
transaction and the transaction that updates the catalog, right?
Can't we do this catalog update before committing the prepared
transaction? If so, both in prepared and non-prepared cases, our
implementation could be the same and we have a reason to accomplish
the catalog update in the same transaction for which we skipped the
changes.
If
so, since these are separate transactions it can be a problem in case
of a crash between these two commits.
Currently, handling rollback prepared already
behaves so; it first checks whether we have prepared the transaction
or not and skips it if we haven’t. So I think we need to do that also for
the commit prepared case. With that, this requires protocol changes so
that the subscriber can get prepare-lsn and prepare-time when handling
commit prepared.
So I’m writing a separate patch to add prepare-lsn and timestamp to
commit_prepared message, which will be a building block for skipping
prepared transactions. Actually, I think it’s beneficial even today;
we can skip preparing the transaction if it’s an empty transaction.
Although the comment says it’s not a common case, I think that it could
happen quite often in some cases:
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
* worthwhile because such cases shouldn't be common.
*/
For example, if the publisher has multiple subscriptions and there are
many prepared transactions that modify the particular table subscribed
by one publisher, many empty transactions are replicated to other
subscribers.
I think this is not clear to me. Why would one have multiple
subscriptions for the same publication? I thought it is possible when
say some publisher doesn't publish any data of prepared transaction
say because the corresponding action is not published or something
like that. I don't deny that someday we want to optimize this case but
it might be better if we don't need to do it along with this patch.
I imagined that the publisher has two publications (say pub-A and
pub-B) that publish a different set of relations in the database and
there are two subscribers that are subscribing to either one
publication (e.g., subscriber-A subscribes to pub-A and subscriber-B
subscribes to pub-B). If many prepared transactions happen on the
publisher and these transactions modify only relations published by
pub-A, both subscriber-A and subscriber-B would prepare the same
number of transactions but all of them on subscriber-B are empty.
Okay, I understand those cases but note always checking if the
prepared xact exists during commit prepared has a cost and that is why
we avoided it in the first place. There is a separate effort in
progress [1] where we want to avoid sending empty transactions in the
first place. So, it is better to avoid this cost via that effort
rather than adding additional cost at commit of each prepared
transaction. OTOH, if there are other strong reasons to do it then we
can probably consider it.
[1]: https://commitfest.postgresql.org/36/3093/
--
With Regards,
Amit Kapila.
On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 8, 2021 at 2:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 7, 2021 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I'll submit the patch tomorrow.
While updating the patch, I realized that skipping a transaction that
is prepared on the publisher will be a bit tricky.
First of all, since skip_xid is in the pg_subscription catalog, we need to
do a catalog update in a transaction and commit it to disable it. I
think we need to set origin-lsn and timestamp of the transaction being
skipped to the transaction that does the catalog update. That is,
during skipping the (not prepared) transaction, we skip all
data-modification changes coming from the publisher, do a catalog
update, and commit the transaction. If we do the catalog update in the
next transaction after skipping the whole transaction, skip_xid could
be left in case of a server crash between them.
But if we haven't updated origin_lsn/timestamp before the crash, won't
it request the same transaction again from the publisher? If so, it
will be again able to skip it because skip_xid is still not updated.
Yes. I mean that if we update origin_lsn and origin_timestamp when
committing the skipped transaction and then update the catalog in the
next transaction it doesn't work in case of a crash. But it's not
possible in the first place since the first transaction is empty and
we cannot set origin_lsn and origin_timestamp to it.
Also, we cannot set
origin-lsn and timestamp to an empty transaction.
But won't we update the catalog for skip_xid in that case?
Yes. Probably my explanation was not clear. Even if we skip all
changes of the transaction, the transaction doesn't become empty since
we update the catalog.
Do we see any advantage of updating the skip_xid in the same
transaction vs. doing it in a separate transaction? If not then
probably we can choose either of those ways and add some comments to
indicate the possibility of doing it another way.
I think that since the skipped transaction is always empty there is
always one transaction. What we need to consider is when we update
origin_lsn and origin_timestamp. In non-prepared transaction cases,
the only option is when updating the catalog.
Your last sentence is not completely clear to me but it seems you
agree that we can use one transaction instead of two to skip the
changes, perform a catalog update, and update origin_lsn/timestamp.
Yes.
In prepared transaction cases, I think that when handling a prepare
message, we need to commit the transaction to update the catalog,
instead of preparing it. And at the commit prepared and rollback
prepared time, we skip it since there is no prepared transaction
on the subscriber.
Can't we think of just allowing prepare in this case and updating the
skip_xid only at commit time? I see that in this case, we would be
doing prepare for a transaction that has no changes but as such cases
won't be common, isn't that acceptable?
In this case, we will end up committing both the prepared (empty)
transaction and the transaction that updates the catalog, right?
Can't we do this catalog update before committing the prepared
transaction? If so, both in prepared and non-prepared cases, our
implementation could be the same and we have a reason to accomplish
the catalog update in the same transaction for which we skipped the
changes.
But in case of a crash between these two transactions, given that
skip_xid is already cleared how do we know the prepared transaction
that was supposed to be skipped?
If
so, since these are separate transactions it can be a problem in case
of a crash between these two commits.
Currently, handling rollback prepared already
behaves so; it first checks whether we have prepared the transaction
or not and skips it if we haven’t. So I think we need to do that also for
the commit prepared case. With that, this requires protocol changes so
that the subscriber can get prepare-lsn and prepare-time when handling
commit prepared.
So I’m writing a separate patch to add prepare-lsn and timestamp to
commit_prepared message, which will be a building block for skipping
prepared transactions. Actually, I think it’s beneficial even today;
we can skip preparing the transaction if it’s an empty transaction.
Although the comment says it’s not a common case, I think that it could
happen quite often in some cases:
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
* worthwhile because such cases shouldn't be common.
*/
For example, if the publisher has multiple subscriptions and there are
many prepared transactions that modify the particular table subscribed
by one publisher, many empty transactions are replicated to other
subscribers.
I think this is not clear to me. Why would one have multiple
subscriptions for the same publication? I thought it is possible when
say some publisher doesn't publish any data of prepared transaction
say because the corresponding action is not published or something
like that. I don't deny that someday we want to optimize this case but
it might be better if we don't need to do it along with this patch.
I imagined that the publisher has two publications (say pub-A and
pub-B) that publish a different set of relations in the database and
there are two subscribers that are subscribing to either one
publication (e.g., subscriber-A subscribes to pub-A and subscriber-B
subscribes to pub-B). If many prepared transactions happen on the
publisher and these transactions modify only relations published by
pub-A, both subscriber-A and subscriber-B would prepare the same
number of transactions but all of them on subscriber-B are empty.
Okay, I understand those cases but note always checking if the
prepared xact exists during commit prepared has a cost and that is why
we avoided it in the first place. There is a separate effort in
progress [1] where we want to avoid sending empty transactions in the
first place. So, it is better to avoid this cost via that effort
rather than adding additional cost at commit of each prepared
transaction. OTOH, if there are other strong reasons to do it then we
can probably consider it.
Thank you for the information. Agreed.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Dec 8, 2021 at 12:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Can't we think of just allowing prepare in this case and updating the
skip_xid only at commit time? I see that in this case, we would be
doing prepare for a transaction that has no changes but as such cases
won't be common, isn't that acceptable?
In this case, we will end up committing both the prepared (empty)
transaction and the transaction that updates the catalog, right?
Can't we do this catalog update before committing the prepared
transaction? If so, both in prepared and non-prepared cases, our
implementation could be the same and we have a reason to accomplish
the catalog update in the same transaction for which we skipped the
changes.
But in case of a crash between these two transactions, given that
skip_xid is already cleared how do we know the prepared transaction
that was supposed to be skipped?
I was thinking of doing it as one transaction at the time of
commit_prepare. Say, in function apply_handle_commit_prepared(), if we
check whether the skip_xid is the same as prepare_data.xid then update
the catalog and set origin_lsn/timestamp in the same transaction. Why
do we need two transactions for it?
--
With Regards,
Amit Kapila.
On Wed, Dec 8, 2021 at 5:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 8, 2021 at 12:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Can't we think of just allowing prepare in this case and updating the
skip_xid only at commit time? I see that in this case, we would be
doing prepare for a transaction that has no changes but as such cases
won't be common, isn't that acceptable?
In this case, we will end up committing both the prepared (empty)
transaction and the transaction that updates the catalog, right?
Can't we do this catalog update before committing the prepared
transaction? If so, both in prepared and non-prepared cases, our
implementation could be the same and we have a reason to accomplish
the catalog update in the same transaction for which we skipped the
changes.
But in case of a crash between these two transactions, given that
skip_xid is already cleared how do we know the prepared transaction
that was supposed to be skipped?
I was thinking of doing it as one transaction at the time of
commit_prepare. Say, in function apply_handle_commit_prepared(), if we
check whether the skip_xid is the same as prepare_data.xid then update
the catalog and set origin_lsn/timestamp in the same transaction. Why
do we need two transactions for it?
I meant the two transactions are the prepared transaction and the
transaction that updates the catalog. If I understand your idea
correctly, in apply_handle_commit_prepared(), we update the catalog
and set origin_lsn/timestamp. These are done in the same transaction.
Then, we commit the prepared transaction, right? If the server crashes
between them, skip_xid is already cleared and logical replication
starts from the LSN after COMMIT PREPARED. But the prepared
transaction still exists on the subscriber.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Dec 8, 2021 at 4:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Okay, I understand those cases but note always checking if the
prepared xact exists during commit prepared has a cost and that is why
we avoided it in the first place.
BTW what costs were we concerned about? Looking at LookupGXact(), we
look for the 2PC state data on shmem while acquiring TwoPhaseStateLock
in shared mode. And we check origin_lsn and origin_timestamp of 2PC by
reading WAL or 2PC state file only if gid matched. On the other hand,
committing the prepared transaction does WAL logging, waits for
synchronous replication, calls post-commit callbacks, removes the
2PC state file, etc. And it requires acquiring TwoPhaseStateLock in
exclusive mode to remove 2PC state entry. So it looks like always
checking if the prepared transaction exists and skipping it if not is
cheaper than always committing prepared transactions.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Dec 8, 2021 at 4:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 8, 2021 at 5:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 8, 2021 at 12:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Can't we think of just allowing prepare in this case and updating the
skip_xid only at commit time? I see that in this case, we would be
doing prepare for a transaction that has no changes but as such cases
won't be common, isn't that acceptable?
In this case, we will end up committing both the prepared (empty)
transaction and the transaction that updates the catalog, right?
Can't we do this catalog update before committing the prepared
transaction? If so, both in prepared and non-prepared cases, our
implementation could be the same and we have a reason to accomplish
the catalog update in the same transaction for which we skipped the
changes.
But in case of a crash between these two transactions, given that
skip_xid is already cleared how do we know the prepared transaction
that was supposed to be skipped?
I was thinking of doing it as one transaction at the time of
commit_prepare. Say, in function apply_handle_commit_prepared(), if we
check whether the skip_xid is the same as prepare_data.xid then update
the catalog and set origin_lsn/timestamp in the same transaction. Why
do we need two transactions for it?
I meant the two transactions are the prepared transaction and the
transaction that updates the catalog. If I understand your idea
correctly, in apply_handle_commit_prepared(), we update the catalog
and set origin_lsn/timestamp. These are done in the same transaction.
Then, we commit the prepared transaction, right?
I am thinking that we can start a transaction, update the catalog,
commit that transaction. Then start a new one to update
origin_lsn/timestamp, finish the prepared transaction, and commit it. Now, if it
crashes after the first transaction, only commit prepared will be
resent again and this time we don't need to update the catalog as that
entry would be already cleared.
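The two-step flow proposed above can be modeled with a small Python sketch (purely illustrative, not PostgreSQL code; the class, method names, and values are hypothetical): transaction 1 clears skip_xid, transaction 2 advances the origin and finishes the prepared transaction, and a crash in between is harmless because the resent commit prepared finds skip_xid already cleared and simply commits the (empty) prepared transaction.

```python
# Hypothetical model of handling commit prepared for a skipped transaction.

class Subscriber:
    def __init__(self, skip_xid):
        self.skip_xid = skip_xid      # models pg_subscription.subskipxid
        self.origin_lsn = 0           # models replication origin progress
        self.prepared = {}            # gid -> prepared (possibly empty) xact

    def handle_prepare(self, xid, gid):
        # If xid matches, all changes were skipped; the prepared xact is empty.
        self.prepared[gid] = "empty" if xid == self.skip_xid else "changes"

    def handle_commit_prepared(self, xid, gid, lsn, crash_between=False):
        if xid == self.skip_xid:
            self.skip_xid = None      # transaction 1: catalog update, commit
        if crash_between:
            return                    # crash before transaction 2
        self.prepared.pop(gid)        # transaction 2: finish prepared...
        self.origin_lsn = lsn         # ...and advance the origin

sub = Subscriber(skip_xid=590)
sub.handle_prepare(590, "gid-1")
sub.handle_commit_prepared(590, "gid-1", 2000, crash_between=True)
# Crash after step 1: skip_xid cleared, prepared xact still present.
assert sub.skip_xid is None and "gid-1" in sub.prepared
# commit prepared is resent; this time it just commits the empty xact.
sub.handle_commit_prepared(590, "gid-1", 2000)
assert "gid-1" not in sub.prepared and sub.origin_lsn == 2000
```

Again, this only models the ordering argument in the thread, not the worker implementation.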
--
With Regards,
Amit Kapila.
On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 8, 2021 at 4:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 8, 2021 at 5:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 8, 2021 at 12:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Can't we think of just allowing prepare in this case and updating the
skip_xid only at commit time? I see that in this case, we would be
doing prepare for a transaction that has no changes but as such cases
won't be common, isn't that acceptable?
In this case, we will end up committing both the prepared (empty)
transaction and the transaction that updates the catalog, right?
Can't we do this catalog update before committing the prepared
transaction? If so, both in prepared and non-prepared cases, our
implementation could be the same and we have a reason to accomplish
the catalog update in the same transaction for which we skipped the
changes.
But in case of a crash between these two transactions, given that
skip_xid is already cleared how do we know the prepared transaction
that was supposed to be skipped?
I was thinking of doing it as one transaction at the time of
commit_prepare. Say, in function apply_handle_commit_prepared(), if we
check whether the skip_xid is the same as prepare_data.xid then update
the catalog and set origin_lsn/timestamp in the same transaction. Why
do we need two transactions for it?
I meant the two transactions are the prepared transaction and the
transaction that updates the catalog. If I understand your idea
correctly, in apply_handle_commit_prepared(), we update the catalog
and set origin_lsn/timestamp. These are done in the same transaction.
Then, we commit the prepared transaction, right?
I am thinking that we can start a transaction, update the catalog,
commit that transaction. Then start a new one to update
origin_lsn/timestamp, finish the prepared transaction, and commit it. Now, if it
crashes after the first transaction, only commit prepared will be
resent again and this time we don't need to update the catalog as that
entry would be already cleared.
Sounds good. In the crash case, it should be fine since we will just
commit an empty transaction. The same is true for the case where
skip_xid has been changed after skipping and preparing the transaction
and before handling commit_prepared.
Regarding the case where the user specifies the XID of the transaction
after it is prepared on the subscriber (i.e., the transaction is not
empty), we won’t skip committing the prepared transaction. But I think
that we don't need to support skipping an already-prepared transaction
since such a transaction doesn't conflict with anything regardless of
whether it has changes or not.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I am thinking that we can start a transaction, update the catalog,
commit that transaction. Then start a new one to update
origin_lsn/timestamp, finish the prepared transaction, and commit it. Now, if it
crashes after the first transaction, only commit prepared will be
resent again and this time we don't need to update the catalog as that
entry would be already cleared.
Sounds good. In the crash case, it should be fine since we will just
commit an empty transaction. The same is true for the case where
skip_xid has been changed after skipping and preparing the transaction
and before handling commit_prepared.
Regarding the case where the user specifies XID of the transaction
after it is prepared on the subscriber (i.e., the transaction is not
empty), we won’t skip committing the prepared transaction. But I think
that we don't need to support skipping already-prepared transaction
since such transaction doesn't conflict with anything regardless of
having changed or not.
Yeah, this makes sense to me.
--
With Regards,
Amit Kapila.
On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I am thinking that we can start a transaction, update the catalog,
commit that transaction. Then start a new one to update
origin_lsn/timestamp, finish the prepared transaction, and commit it. Now, if it
crashes after the first transaction, only commit prepared will be
resent again and this time we don't need to update the catalog as that
entry would be already cleared.
Sounds good. In the crash case, it should be fine since we will just
commit an empty transaction. The same is true for the case where
skip_xid has been changed after skipping and preparing the transaction
and before handling commit_prepared.
Regarding the case where the user specifies XID of the transaction
after it is prepared on the subscriber (i.e., the transaction is not
empty), we won’t skip committing the prepared transaction. But I think
that we don't need to support skipping already-prepared transaction
since such transaction doesn't conflict with anything regardless of
having changed or not.
Yeah, this makes sense to me.
I've attached an updated patch. The new syntax is like "ALTER
SUBSCRIPTION testsub SKIP (xid = '123')".
I’ve been thinking we could add some safeguard for the case where
the user specifies the wrong xid. For example, can we somehow use the
stats in pg_stat_subscription_workers? One idea is that the logical
replication worker fetches the xid from the stats when reading the
subscription and skips the transaction only if that xid matches
subskipxid. That is, the worker checks the error reported by the
worker previously working on the same subscription. The reported error
might not be a conflict error (e.g., a connection error) or might have
been cleared by the reset function, but given that the worker is in an
error loop, it can eventually get the xid in question. This way we can
prevent an unrelated transaction from being skipped unexpectedly,
though it doesn't seem like a very stable solution. Or it might be
enough to warn users when they specify an XID that doesn't match
last_error_xid. Anyway, I think it's better to have more discussion on
this. Any ideas?
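As a rough illustration of the warning variant floated above (hypothetical Python, not part of the patch; the function name and message text are made up), the check could compare the user-supplied xid against last_error_xid from the stats:

```python
# Hypothetical sketch: warn when the xid given to
# ALTER SUBSCRIPTION ... SKIP does not match the last error reported
# in pg_stat_subscription_workers. Names are illustrative only.

import warnings

def set_skip_xid(requested_xid, last_error_xid):
    """Return the xid to store in subskipxid, warning on a mismatch."""
    if last_error_xid is not None and requested_xid != last_error_xid:
        warnings.warn(
            f"xid {requested_xid} does not match the last failed "
            f"transaction {last_error_xid}; an unrelated transaction "
            f"might be skipped")
    return requested_xid

assert set_skip_xid(590, 590) == 590          # matches: no warning
with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter("always")
    set_skip_xid(123, 590)                    # mismatch: warns
    assert len(w) == 1
```

This only models the user-facing check being debated, not any actual backend code.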
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transactio.patch (application/octet-stream)
From d53b8db12749d020a19ea67884c925d5e5510e6f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX),
updating pg_subscription.subskipxid field, telling the apply worker to
skip the transaction. The apply worker skips all data modification
changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid.
---
doc/src/sgml/logical-replication.sgml | 54 +++++-
doc/src/sgml/ref/alter_subscription.sgml | 41 ++++
src/backend/catalog/pg_subscription.c | 10 +
src/backend/commands/subscriptioncmds.c | 54 ++++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 213 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/tab-complete.c | 8 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 26 +++
src/test/regress/sql/subscription.sql | 14 ++
src/test/subscription/t/027_skip_xact.pl | 204 ++++++++++++++++++++
13 files changed, 634 insertions(+), 10 deletions(-)
create mode 100644 src/test/subscription/t/027_skip_xact.pl
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 45b2e1e28f..3237f68b04 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,66 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the whole
+ transaction that conflicts with the existing data. The transaction can be
+ skipped by specifying its ID, in which case the logical replication worker
+ skips all data modification changes within that transaction. When a conflict
+ produces an error, it is shown in the
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_xid | 716
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-09-29 15:52:45.165754+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Before skipping a transaction, consider changing the data on the subscriber
+ so that it doesn't conflict with incoming changes, dropping the conflicting
+ constraint or unique index, or writing a trigger on the subscriber to
+ suppress or redirect conflicting incoming changes. Skipping should be used
+ only as a last resort: both methods skip the whole transaction, including
+ changes that may not violate any constraint, and may easily make the
+ subscriber inconsistent, especially if a user specifies the wrong
+ transaction ID or position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..79a05e08ab 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -207,6 +208,46 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ Skip applying the changes of a particular transaction. If incoming data
+ violates any constraint, logical replication will stop until the problem
+ is resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with the incoming changes or by
+ skipping the whole transaction. The logical replication worker skips all
+ data modification changes within the specified transaction. Since this
+ skips the whole transaction, including changes that may not violate any
+ constraint, it should only be used as a last resort. This option has no
+ effect on a transaction that is already prepared with
+ <literal>two_phase</literal> enabled on the subscriber. After the worker
+ successfully skips the transaction, the transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the transaction whose application is to be skipped
+ by the logical replication worker. Setting -1 means to reset the
+ transaction ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..cb22cd7463 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
+
ReleaseSysCache(tup);
return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 2b658080fe..6e64d8ccac 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -61,6 +61,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +83,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId xid; /* InvalidTransactionId for resetting purpose, otherwise
+ * normal transaction id */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +252,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (strcmp(xid_str, "-1") == 0)
+ {
+ /* Setting -1 to xid means to reset it */
+ xid = InvalidTransactionId;
+ }
+ else
+ {
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id: %s", xid_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_XID;
+ opts->xid = xid;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +492,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1112,31 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ if (IsSet(opts.specified_opts, SUBOPT_XID))
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
+
+ if (TransactionIdIsValid(opts.xid))
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.xid);
+ else
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ update_tuple = true;
+ }
+
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3d4dd43e47..ba039ff9a6 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9956,6 +9956,15 @@ AlterSubscriptionStmt:
(Node *)makeInteger(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 2e79302a48..6c2ff569b9 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -255,6 +256,19 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * True if we are skipping all data modification changes (INSERT, UPDATE,
+ * etc.) of the transaction specified by MySubscription->skipxid. Once we
+ * start skipping, we don't stop until we have skipped all changes of the
+ * transaction, even if the subscription is invalidated and skipxid gets
+ * changed or reset. In streaming cases we still receive the changes; we
+ * decide whether to skip applying them when starting to apply changes.
+ * At the end of the transaction we disable skipping and reset the skip
+ * XID; the timing of the reset differs between the commit and the
+ * commit/rollback prepared cases (see those functions for details).
+ */
+static bool skipping_changes = false;
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -330,6 +344,11 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(bool clear_subskipxid);
+static void clear_subscription_skip_xid(void);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
@@ -789,6 +808,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -841,6 +865,11 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -897,11 +926,32 @@ apply_handle_prepare(StringInfo s)
LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
+ /*
+ * If we are skipping all changes of this transaction, we stop doing so,
+ * but unlike commit we do not clear subskipxid of the pg_subscription
+ * catalog here; we do that at commit prepared or rollback prepared time.
+ * If we updated the catalog and then prepared the transaction, a server
+ * crash between the two would leave subskipxid cleared even though this
+ * transaction will be resent. If we did it in the reverse order,
+ * subskipxid would not be cleared but this transaction won't be resent.
+ *
+ * Also, one might think we could skip preparing the skipped transaction
+ * altogether. But then the PREPARE WAL record wouldn't be sent to the
+ * physical standbys, so users wouldn't be able to find the prepared
+ * transaction entry after a failover.
+ *
+ * subskipxid might be changed or cleared by the user before we receive
+ * COMMIT PREPARED or ROLLBACK PREPARED, but that's okay because this
+ * prepared transaction is empty.
+ */
+ if (skipping_changes)
+ stop_skipping_changes(false);
+
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It is
+ * done this way because at commit prepared time, we won't know whether we
+ * have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -938,6 +988,24 @@ apply_handle_commit_prepared(StringInfo s)
logicalrep_read_commit_prepared(s, &prepare_data);
set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ if (MySubscription->skipxid == prepare_data.xid)
+ {
+ /*
+ * Clear subskipxid of the pg_subscription catalog. This catalog
+ * update must be committed before finishing the prepared
+ * transaction; otherwise, if the server crashes between finishing
+ * the prepared transaction and the catalog update, COMMIT PREPARED
+ * won't be resent but subskipxid is left set.
+ *
+ * Also, we must not update the replication origin LSN and timestamp
+ * while committing the catalog update, so that COMMIT PREPARED will
+ * be resent in case of a crash immediately after the catalog update
+ * commit.
+ */
+ clear_subscription_skip_xid();
+ CommitTransactionCommand();
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
gid, sizeof(gid));
@@ -979,6 +1047,17 @@ apply_handle_rollback_prepared(StringInfo s)
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ if (MySubscription->skipxid == rollback_data.xid)
+ {
+ /*
+ * Same as COMMIT PREPARED, we must clear subskipxid of pg_subscription
+ * before rolling back the prepared transaction. Please see the comments
+ * in apply_handle_commit_prepared() for details.
+ */
+ clear_subscription_skip_xid();
+ CommitTransactionCommand();
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
gid, sizeof(gid));
@@ -1046,9 +1125,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /* Enable skipping all changes of this transaction if specified. */
+ maybe_start_skipping_changes(prepare_data.xid);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
+ /*
+ * Same as PREPARE, we stop skipping changes but don't clear subskipxid
+ * here. See the comments in apply_handle_prepare() for details.
+ */
+ if (skipping_changes)
+ stop_skipping_changes(false);
+
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -1207,6 +1296,16 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of a transaction that is
+ * rolled back, but if the skip XID is set, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ {
+ clear_subscription_skip_xid();
+ CommitTransactionCommand();
+ }
+
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
* just delete the files with serialized info.
@@ -1428,8 +1527,15 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
+ /*
+ * Commit streamed transaction. If we're skipping this transaction,
+ * we stop it in apply_handle_commit_internal().
+ */
apply_handle_commit_internal(&commit_data);
/* unlink the files with serialized changes and subxact info */
@@ -1449,8 +1555,17 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
- if (IsTransactionState())
+ if (IsTransactionState() || skipping_changes)
{
+ /*
+ * If we are skipping all changes of this transaction, we stop it
+ * and clear subskipxid of pg_subscription. The catalog update is
+ * committed at CommitTransactionCommand() below while updating
+ * the replication origin LSN and timestamp.
+ */
+ if (skipping_changes)
+ stop_skipping_changes(true);
+
/*
* Update origin state so we can restart streaming from correct
* position in case of crash.
@@ -2319,6 +2434,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (skipping_changes &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3613,6 +3739,85 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!skipping_changes);
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (MySubscription->skipxid != xid)
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skipping_changes = true;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes by resetting skipping_changes. If clear_subskipxid
+ * is true, we also set subskipxid of the pg_subscription catalog to NULL.
+ */
+static void
+stop_skipping_changes(bool clear_subskipxid)
+{
+ Assert(skipping_changes);
+
+ /* Stop skipping changes */
+ skipping_changes = false;
+
+ if (clear_subskipxid)
+ clear_subscription_skip_xid();
+
+ ereport(LOG, (errmsg("done skipping logical replication transaction")));
+}
+
+/* Update subskipxid of pg_subscription to NULL */
+static void
+clear_subscription_skip_xid(void)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 10a86f9810..c025d64b83 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4513,6 +4513,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't fetch subskipxid as we don't
+ * include it in the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2f412ca3db..ed14d8e6c5 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1666,7 +1666,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1682,6 +1682,12 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("xid");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..beaa6e646d 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid BKI_FORCE_NULL; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 4c5a8a39bf..ab778865d8 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3709,7 +3709,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..f5c757f4fd 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -93,6 +93,32 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 4294967295
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = -1);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub |
+(1 row)
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ERROR: invalid transaction id: 1.1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = -2);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction id: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ERROR: invalid transaction id: 1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+ERROR: invalid transaction id: 2
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..00df54e84a 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,20 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = -1);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = -2);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/027_skip_xact.pl b/src/test/subscription/t/027_skip_xact.pl
new file mode 100644
index 0000000000..37c2fb3dba
--- /dev/null
+++ b/src/test/subscription/t/027_skip_xact.pl
@@ -0,0 +1,204 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 6;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with the subscriber. After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# with ALTER SUBSCRIPTION ... SKIP. Then, we check that logical replication can
+# continue working by inserting $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname, $nonconflict_data,
+ $expected, $xid, $msg) = @_;
+
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND last_error_xid = '$xid'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]) or die "Timed out while waiting for worker error";
+
+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (xid = '$xid')");
+
+ # Restart the subscriber node so logical replication restarts without delay.
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT subskipxid IS NULL FROM pg_subscription WHERE subname = '$subname'");
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (two_phase = on, streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data into test_tab, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab. Then skip the transaction.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", $xid,
+ "test skipping transaction");
+
+# Test for PREPARE. Insert the same data into test_tab and PREPARE the
+# transaction, raising an error. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", $xid,
+ "test skipping prepare and commit prepared");
+
+# Test for PREPARE and ROLLBACK PREPARED.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(4)", "4", $xid,
+ "test skipping prepare and rollback prepared");
+
+# Test for STREAM COMMIT. Insert enough rows into test_tab_streaming to exceed
+# the 64kB limit, also raising an error on the subscriber while applying the
+# spooled changes, for the same reason. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(2, md5(2::text))", "2", $xid,
+ "test skipping stream-commit");
+
+# Test for STREAM PREPARE and COMMIT PREPARED.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(3, md5(3::text))", "3", $xid,
+ "test skipping stream-prepare and commit prepared");
+
+my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0", "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
On Fri, Dec 10, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I am thinking that we can start a transaction, update the catalog,
commit that transaction. Then start a new one to update
origin_lsn/timestamp, finish the prepared transaction, and commit it. Now, if it
crashes after the first transaction, only commit prepared will be
resent again and this time we don't need to update the catalog as that
entry would be already cleared.

Sounds good. In the crash case, it should be fine since we will just
commit an empty transaction. The same is true for the case where
skip_xid has been changed after skipping and preparing the transaction
and before handling commit_prepared.

Regarding the case where the user specifies the XID of the transaction
after it is prepared on the subscriber (i.e., the transaction is not
empty), we won’t skip committing the prepared transaction. But I think
that we don't need to support skipping already-prepared transaction
since such transaction doesn't conflict with anything regardless of
having changed or not.

Yeah, this makes sense to me.
I've attached an updated patch. The new syntax is like "ALTER
SUBSCRIPTION testsub SKIP (xid = '123')".

I’ve been thinking we could add some safeguard for the case where
the user specifies the wrong xid. For example, can we somehow use the
stats in pg_stat_subscription_workers? An idea is that logical
replication worker fetches the xid from the stats when reading the
subscription and skips the transaction if the xid matches to
subskipxid. That is, the worker checks the error reported by the
worker previously working on the same subscription. The error might
not be a conflict error (e.g., a connection error) or might have
been cleared by the reset function. But given that the worker is in an
error loop, it can eventually get the xid in question. We can
prevent an unrelated transaction from being skipped unexpectedly. It
seems not a stable solution though. Or it might be enough to warn
users when they specify an XID that doesn’t match last_error_xid.
I think the idea is good, but because it is not predictable, as pointed
out by you, we might want to just issue a LOG/WARNING. If not already
mentioned, then please do mention in docs the possibility of skipping
non-errored transactions.
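To make the proposed workflow concrete, the subscriber-side steps would look roughly like this (the SKIP syntax is from the patch under discussion, and the stats view/column names follow this thread; all of it may change before commit):

```sql
-- Find the XID of the transaction that keeps failing for this
-- subscription (view/column names as discussed in this thread).
SELECT last_error_xid
FROM pg_stat_subscription_workers
WHERE subname = 'tap_sub';

-- Tell the apply worker to skip applying that transaction.
ALTER SUBSCRIPTION tap_sub SKIP (xid = '590');

-- The worker clears pg_subscription.subskipxid itself once the
-- transaction has been skipped; per the proposed docs, setting -1
-- resets it manually.
ALTER SUBSCRIPTION tap_sub SKIP (xid = '-1');
```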
Few comments/questions:
=====================
1.
+ Specifies the ID of the transaction whose application is to be skipped
+ by the logical replication worker. Setting -1 means to reset the
+ transaction ID.
Can we change it to something like: "Specifies the ID of the
transaction whose changes are to be skipped by the logical replication
worker. ...."
2.
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;
Can't we assign it as we do for other fixed columns like subdbid,
subowner, etc.?
3.
+ * Also, we don't skip receiving the changes in streaming cases, since we decide
+ * whether or not to skip applying the changes when starting to apply changes.
But why so? Can't we even skip streaming (and writing to file all such
messages)? If we can do this then we can avoid even collecting all
messages in a file.
4.
+ * Also, one might think that we can skip preparing the skipped transaction.
+ * But if we do that, PREPARE WAL record won’t be sent to its physical
+ * standbys, resulting in that users won’t be able to find the prepared
+ * transaction entry after a fail-over.
+ *
..
+ */
+ if (skipping_changes)
+ stop_skipping_changes(false);
Why do we need such a Prepare's entry either at the current subscriber or
on its physical standby? I think it is to allow Commit-prepared. If
so, how about if we skip even commit prepared as well? Even on
physical standby, we would be having the value of skip_xid which can
help us to skip there as well after failover.
--
With Regards,
Amit Kapila.
On Sat, Dec 11, 2021 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Dec 10, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I am thinking that we can start a transaction, update the catalog,
commit that transaction. Then start a new one to update
origin_lsn/timestamp, finishprepared, and commit it. Now, if it
crashes after the first transaction, only commit prepared will be
resent again and this time we don't need to update the catalog as that
entry would be already cleared.

Sounds good. In the crash case, it should be fine since we will just
commit an empty transaction. The same is true for the case where
skip_xid has been changed after skipping and preparing the transaction
and before handling commit_prepared.

Regarding the case where the user specifies the XID of the transaction
after it is prepared on the subscriber (i.e., the transaction is not
empty), we won’t skip committing the prepared transaction. But I think
that we don't need to support skipping already-prepared transaction
since such transaction doesn't conflict with anything regardless of
having changed or not.

Yeah, this makes sense to me.
I've attached an updated patch. The new syntax is like "ALTER
SUBSCRIPTION testsub SKIP (xid = '123')".

I’ve been thinking we could add some safeguard for the case where
the user specifies the wrong xid. For example, can we somehow use the
stats in pg_stat_subscription_workers? An idea is that logical
replication worker fetches the xid from the stats when reading the
subscription and skips the transaction if the xid matches to
subskipxid. That is, the worker checks the error reported by the
worker previously working on the same subscription. The error could
not be a conflict error (e.g., connection error etc.) or might have
been cleared by the reset function. But given that the worker is in an
error loop, the worker can eventually get xid in question. We can
prevent an unrelated transaction from being skipped unexpectedly. It
seems not a stable solution though. Or it might be enough to warn
users when they specify an XID that doesn’t match last_error_xid.

I think the idea is good but because it is not predictable as pointed
by you so we might want to just issue a LOG/WARNING. If not already
mentioned, then please do mention in docs the possibility of skipping
non-errored transactions.

Few comments/questions:
=====================
1.
+ Specifies the ID of the transaction whose application is to be skipped
+ by the logical replication worker. Setting -1 means to reset the
+ transaction ID.

Can we change it to something like: "Specifies the ID of the
transaction whose changes are to be skipped by the logical replication
worker. ...."
Agreed.
2.
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
Assert(!isnull);
sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));

+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;

Can't we assign it as we do for other fixed columns like subdbid,
subowner, etc.?
Yeah, I think we can use InvalidTransactionId as the initial value
instead of setting NULL. Then, we can change this code.
3.
+ * Also, we don't skip receiving the changes in streaming cases, since we decide
+ * whether or not to skip applying the changes when starting to apply changes.

But why so? Can't we even skip streaming (and writing to file all such
messages)? If we can do this then we can avoid even collecting all
messages in a file.
IIUC in streaming cases, a transaction can be sent to the subscriber
split into multiple chunks of changes. In the meanwhile,
skip_xid can be changed. If the user changes or clears skip_xid after
the subscriber has skipped some streamed changes, the subscriber won't be able
to have the complete changes of the transaction.
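The hazard can be sketched with a toy model (plain Python, not PostgreSQL code; all names are made up for illustration): if the skip decision were re-evaluated for every streamed chunk, a mid-stream change of skip_xid would leave an incomplete transaction in the spool.

```python
# Toy model: spool the chunks of one streamed transaction, consulting a
# (possibly changing) skip_xid setting before each chunk.
def spool_chunks(chunks, skip_xid_per_chunk, xid):
    spooled = []
    for chunk, skip_xid in zip(chunks, skip_xid_per_chunk):
        if skip_xid != xid:        # skip_xid was changed or cleared
            spooled.append(chunk)  # chunk is kept
        # else: chunk is dropped
    return spooled

# xid 100 is being skipped, but the user clears skip_xid after chunk 1:
chunks = ["chunk1", "chunk2", "chunk3"]
result = spool_chunks(chunks, [100, None, None], xid=100)
# The spool now holds an incomplete transaction: chunk1 is missing.
assert result == ["chunk2", "chunk3"]
```

Deciding once, at the start of the transaction, avoids this inconsistency at the cost of honoring a stale skip_xid for the rest of that transaction.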
4.
+ * Also, one might think that we can skip preparing the skipped transaction.
+ * But if we do that, PREPARE WAL record won’t be sent to its physical
+ * standbys, resulting in that users won’t be able to find the prepared
+ * transaction entry after a fail-over.
+ *
..
+ */
+ if (skipping_changes)
+ stop_skipping_changes(false);

Why do we need such a Prepare's entry either at the current subscriber or
on its physical standby? I think it is to allow Commit-prepared. If
so, how about if we skip even commit prepared as well? Even on
physical standby, we would be having the value of skip_xid which can
help us to skip there as well after failover.
It's true that skip_xid would be set also on the physical standby. When it
comes to preparing the skipped transaction on the current subscriber,
if we want to skip commit-prepared I think we need protocol changes in
order for subscribers to know prepare_lsn and prepare_timestamp so
that they can look up the prepared transaction when doing
commit-prepared. I proposed this idea before. This change would be
beneficial as of now since the publisher sends even empty transactions.
But considering the proposed patch[1] that makes the publisher not
send empty transactions, this protocol change would be an optimization
only for this feature.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Dec 10, 2021 at 4:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. The new syntax is like "ALTER
SUBSCRIPTION testsub SKIP (xid = '123')".
I have some review comments:
(1) Patch comment - some suggested wording improvements
BEFORE:
If incoming change violates any constraint, logical replication stops
AFTER:
If an incoming change violates any constraint, logical replication stops
BEFORE:
The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX),
updating pg_subscription.subskipxid field, telling the apply worker to
skip the transaction.
AFTER:
The user can specify the XID of the transaction to skip using
ALTER SUBSCRIPTION ... SKIP (xid = XXX), updating the pg_subscription.subskipxid
field, telling the apply worker to skip the transaction.
src/sgml/logical-replication.sgml
(2) Some suggested wording improvements
(i) Missing "the"
BEFORE:
+ the existing data. When a conflict produce an error, it is shown in
AFTER:
+ the existing data. When a conflict produce an error, it is shown in the
(ii) Suggest starting a new sentence
BEFORE:
+ and it is also shown in subscriber's server log as follows:
AFTER:
+ The error is also shown in the subscriber's server log as follows:
(iii) Context message should say "at ..." instead of "with commit
timestamp ...", to match the actual output from the current code
BEFORE:
+CONTEXT: processing remote data during "INSERT" for replication
target relation "public.test" in transaction 716 with commit timestamp
2021-09-29 15:52:45.165754+00
AFTER:
+CONTEXT: processing remote data during "INSERT" for replication
target relation "public.test" in transaction 716 at 2021-09-29
15:52:45.165754+00
(iv) The following paragraph seems out of place, with the information
presented in the wrong order:
+ <para>
+ In this case, you need to consider changing the data on the
subscriber so that it
+ doesn't conflict with incoming changes, or dropping the
conflicting constraint or
+ unique index, or writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or as a last resort, by skipping the
whole transaction.
+ They skip the whole transaction, including changes that may not violate any
+ constraint. They may easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
+ </para>
How about rearranging it as follows:
+ <para>
+ These methods skip the whole transaction, including changes that
may not violate
+ any constraint. They may easily make the subscriber inconsistent,
especially if
+ a user specifies the wrong transaction ID or the position of
origin, and should
+ be used as a last resort.
+ Alternatively, you might consider changing the data on the
subscriber so that it
+ doesn't conflict with incoming changes, or dropping the
conflicting constraint or
+ unique index, or writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes.
+ </para>
doc/src/sgml/ref/alter_subscription.sgml
(3)
(i) Doc needs clarification
BEFORE:
+ the whole transaction. The logical replication worker skips all data
AFTER:
+ the whole transaction. For the latter case, the logical
replication worker skips all data
(ii) "Setting -1 means to reset the transaction ID"
Shouldn't it be explained what resetting actually does and when it can
be, or is needed to be, done? Isn't it automatically reset?
I notice that negative values (other than -1) seem to be regarded as
valid - is that right?
Also, what happens if this option is set multiple times? Does it just
override and use the latest setting? (other option handling errors out
with errorConflictingDefElem()).
e.g. alter subscription sub skip (xid = 721, xid = 722);
src/backend/replication/logical/worker.c
(4) Shouldn't the "done skipping logical replication transaction"
message also include the skipped XID value at the end?
src/test/subscription/t/027_skip_xact.pl
(5) Some suggested wording improvements
(i)
BEFORE:
+# Test skipping the transaction. This function must be called after the caller
+# inserting data that conflict with the subscriber. After waiting for the
+# subscription worker stats are updated, we skip the transaction in question
+# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication
can continue
+# working by inserting $nonconflict_data on the publisher.
AFTER:
+# Test skipping the transaction. This function must be called after the caller
+# inserts data that conflicts with the subscriber. After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication
can continue
+# working by inserting $nonconflict_data on the publisher.
(ii)
BEFORE:
+# will conflict with the data replicated from publisher later.
AFTER:
+# will conflict with the data replicated later from the publisher.
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Dec 13, 2021 at 8:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Dec 11, 2021 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
3.
+ * Also, we don't skip receiving the changes in streaming cases, since we decide
+ * whether or not to skip applying the changes when starting to apply changes.

But why so? Can't we even skip streaming (and writing to file all such
messages)? If we can do this then we can avoid even collecting all
messages in a file.

IIUC in streaming cases, a transaction can be sent to the subscriber
split into multiple chunks of changes. In the meanwhile,
skip_xid can be changed. If the user changes or clears skip_xid after
the subscriber has skipped some streamed changes, the subscriber won't be able
to have the complete changes of the transaction.
Yeah, I think if we want we can handle this by writing into the stream
xid file whether the changes need to be skipped and then the
consecutive streams can check that in the file, or maybe in some way
don't allow skip_xid to be changed in worker if it is already skipping
some xact. If we don't want to do anything for this then it is better
to at least reflect this reasoning in the comments.
4.
+ * Also, one might think that we can skip preparing the skipped transaction.
+ * But if we do that, PREPARE WAL record won’t be sent to its physical
+ * standbys, resulting in that users won’t be able to find the prepared
+ * transaction entry after a fail-over.
+ *
..
+ */
+ if (skipping_changes)
+ stop_skipping_changes(false);

Why do we need such a Prepare's entry either at the current subscriber or
on its physical standby? I think it is to allow Commit-prepared. If
so, how about if we skip even commit prepared as well? Even on
physical standby, we would be having the value of skip_xid which can
help us to skip there as well after failover.

It's true that skip_xid would be set also on the physical standby. When it
comes to preparing the skipped transaction on the current subscriber,
if we want to skip commit-prepared I think we need protocol changes in
order for subscribers to know prepare_lsn and prepare_timestamp so
that they can look up the prepared transaction when doing
commit-prepared. I proposed this idea before. This change would be
beneficial as of now since the publisher sends even empty transactions.
But considering the proposed patch[1] that makes the publisher not
send empty transactions, this protocol change would be an optimization
only for this feature.
I was thinking to compare the xid received as part of the
commit_prepared message with the value of skip_xid to skip the
commit_prepared but I guess the user could change it between prepare
and commit prepared and then we won't be able to detect it, right? I
think we can handle this and the streaming case if we disallow users
to change the value of skip_xid when we are already skipping changes
or don't let the new skip_xid to reflect in the apply worker if we are
already skipping some other transaction. What do you think?
--
With Regards,
Amit Kapila.
On Mon, Dec 13, 2021 at 1:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 13, 2021 at 8:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Dec 11, 2021 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
3.
+ * Also, we don't skip receiving the changes in streaming cases, since we decide
+ * whether or not to skip applying the changes when starting to apply changes.

But why so? Can't we even skip streaming (and writing to file all such
messages)? If we can do this then we can avoid even collecting all
messages in a file.

IIUC in streaming cases, a transaction can be sent to the subscriber
split into multiple chunks of changes. In the meanwhile,
skip_xid can be changed. If the user changes or clears skip_xid after
the subscriber has skipped some streamed changes, the subscriber won't be able
to have the complete changes of the transaction.

Yeah, I think if we want we can handle this by writing into the stream
xid file whether the changes need to be skipped and then the
consecutive streams can check that in the file or may be in some way
don't allow skip_xid to be changed in worker if it is already skipping
some xact. If we don't want to do anything for this then it is better
to at least reflect this reasoning in the comments.
Yes. Given that we still need to apply messages other than
data-modification messages, we need to skip writing only these changes
to the stream file.
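That selective spooling can be sketched in plain Python (a toy model, not PostgreSQL code; the message names are illustrative): drop only the data-modification messages while still writing the control and relation messages, so the spool file stays structurally complete.

```python
# Toy model: write a streamed chunk to the per-xid spool file, skipping
# only the data-modification messages when the transaction is skipped.
DATA_MESSAGES = {"INSERT", "UPDATE", "DELETE", "TRUNCATE"}

def spool_stream(messages, skipping):
    spooled = []
    for kind, payload in messages:
        if skipping and kind in DATA_MESSAGES:
            continue  # skip only the change itself
        spooled.append((kind, payload))
    return spooled

stream = [
    ("STREAM_START", "xid=100"),
    ("RELATION", "public.test_tab_streaming"),
    ("INSERT", "(1, ...)"),
    ("INSERT", "(2, ...)"),
    ("STREAM_STOP", ""),
]
kept = spool_stream(stream, skipping=True)
assert [k for k, _ in kept] == ["STREAM_START", "RELATION", "STREAM_STOP"]
```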
4.
+ * Also, one might think that we can skip preparing the skipped transaction.
+ * But if we do that, PREPARE WAL record won’t be sent to its physical
+ * standbys, resulting in that users won’t be able to find the prepared
+ * transaction entry after a fail-over.
+ *
..
+ */
+ if (skipping_changes)
+ stop_skipping_changes(false);

Why do we need such a Prepare's entry either at the current subscriber or
on its physical standby? I think it is to allow Commit-prepared. If
so, how about if we skip even commit prepared as well? Even on
physical standby, we would be having the value of skip_xid which can
help us to skip there as well after failover.

It's true that skip_xid would be set also on the physical standby. When it
comes to preparing the skipped transaction on the current subscriber,
if we want to skip commit-prepared I think we need protocol changes in
order for subscribers to know prepare_lsn and prepare_timestamp so
that they can look up the prepared transaction when doing
commit-prepared. I proposed this idea before. This change would be
beneficial as of now since the publisher sends even empty transactions.
But considering the proposed patch[1] that makes the publisher not
send empty transactions, this protocol change would be an optimization
only for this feature.

I was thinking to compare the xid received as part of the
commit_prepared message with the value of skip_xid to skip the
commit_prepared but I guess the user could change it between prepare
and commit prepared and then we won't be able to detect it, right? I
think we can handle this and the streaming case if we disallow users
to change the value of skip_xid when we are already skipping changes
or don't let the new skip_xid to reflect in the apply worker if we are
already skipping some other transaction. What do you think?
In streaming cases, we don’t know when stream-commit or stream-abort
comes and another conflict could occur on the subscription in the
meanwhile. But given that (we expect) this feature is used after the
apply worker enters into an error loop, this is unlikely to happen in
practice unless the user sets the wrong XID. Similarly, in 2PC cases,
we don’t know when commit-prepared or rollback-prepared comes and
another conflict could occur in the meanwhile. But this could occur in
practice even if the user specified the correct XID. Therefore, if we
disallow changing skip_xid until the subscriber receives
commit-prepared or rollback-prepared, we cannot skip the second
transaction that conflicts with data on the subscriber.
From the application perspective, which behavior is preferable between
skipping preparing a transaction and preparing an empty transaction,
in the first place? From the resource consumption etc., skipping
preparing transactions seems better. On the other hand, if we skipped
preparing the transaction, the application would not be able to find
the prepared transaction after a fail-over to the subscriber.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Dec 13, 2021 at 6:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Dec 13, 2021 at 1:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 13, 2021 at 8:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
4.
+ * Also, one might think that we can skip preparing the skipped transaction.
+ * But if we do that, PREPARE WAL record won’t be sent to its physical
+ * standbys, resulting in that users won’t be able to find the prepared
+ * transaction entry after a fail-over.
+ *
..
+ */
+ if (skipping_changes)
+ stop_skipping_changes(false);

Why do we need such a Prepare's entry either at the current subscriber or
on its physical standby? I think it is to allow Commit-prepared. If
so, how about if we skip even commit prepared as well? Even on
physical standby, we would be having the value of skip_xid which can
help us to skip there as well after failover.

It's true that skip_xid would be set also on the physical standby. When it
comes to preparing the skipped transaction on the current subscriber,
if we want to skip commit-prepared I think we need protocol changes in
order for subscribers to know prepare_lsn and prepare_timestamp so
that they can look up the prepared transaction when doing
commit-prepared. I proposed this idea before. This change would be
beneficial as of now since the publisher sends even empty transactions.
But considering the proposed patch[1] that makes the publisher not
send empty transactions, this protocol change would be an optimization
only for this feature.

I was thinking to compare the xid received as part of the
commit_prepared message with the value of skip_xid to skip the
commit_prepared but I guess the user could change it between prepare
and commit prepared and then we won't be able to detect it, right? I
think we can handle this and the streaming case if we disallow users
to change the value of skip_xid when we are already skipping changes
or don't let the new skip_xid to reflect in the apply worker if we are
already skipping some other transaction. What do you think?

In streaming cases, we don’t know when stream-commit or stream-abort
comes and another conflict could occur on the subscription in the
meanwhile. But given that (we expect) this feature is used after the
apply worker enters into an error loop, this is unlikely to happen in
practice unless the user sets the wrong XID. Similarly, in 2PC cases,
we don’t know when commit-prepared or rollback-prepared comes and
another conflict could occur in the meanwhile. But this could occur in
practice even if the user specified the correct XID. Therefore, if we
disallow changing skip_xid until the subscriber receives
commit-prepared or rollback-prepared, we cannot skip the second
transaction that conflicts with data on the subscriber.
I agree with this theory. Can we reflect this in comments so that in
the future we know why we didn't pursue this direction?
From the application perspective, which behavior is preferable between
skipping preparing a transaction and preparing an empty transaction,
in the first place? From the resource consumption etc., skipping
preparing transactions seems better. On the other hand, if we skipped
preparing the transaction, the application would not be able to find
the prepared transaction after a fail-over to the subscriber.
I am not sure how much it matters that such prepares are not present
because we wanted some way to skip the corresponding commit prepared
as well. I think your previous point is a good enough reason as to why
we should allow such prepares.
--
With Regards,
Amit Kapila.
On Fri, Dec 10, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I am thinking that we can start a transaction, update the catalog,
commit that transaction. Then start a new one to update
origin_lsn/timestamp, finishprepared, and commit it. Now, if it
crashes after the first transaction, only commit prepared will be
resent again and this time we don't need to update the catalog as that
entry would be already cleared.

Sounds good. In the crash case, it should be fine since we will just
commit an empty transaction. The same is true for the case where
skip_xid has been changed after skipping and preparing the transaction
and before handling commit_prepared.

Regarding the case where the user specifies the XID of the transaction
after it is prepared on the subscriber (i.e., the transaction is not
empty), we won’t skip committing the prepared transaction. But I think
that we don't need to support skipping already-prepared transaction
since such transaction doesn't conflict with anything regardless of
having changed or not.

Yeah, this makes sense to me.
I've attached an updated patch. The new syntax is like "ALTER
SUBSCRIPTION testsub SKIP (xid = '123')".

I’ve been thinking we could add some safeguard for the case where
the user specifies the wrong xid. For example, can we somehow use the
stats in pg_stat_subscription_workers? An idea is that logical
replication worker fetches the xid from the stats when reading the
subscription and skips the transaction if the xid matches to
subskipxid. That is, the worker checks the error reported by the
worker previously working on the same subscription. The error could
not be a conflict error (e.g., connection error etc.) or might have
been cleared by the reset function. But given that the worker is in an
error loop, the worker can eventually get xid in question. We can
prevent an unrelated transaction from being skipped unexpectedly. It
seems not a stable solution though. Or it might be enough to warn
users when they specified an XID that doesn’t match to last_error_xid.
Anyway, I think it’s better to have more discussion on this. Any
ideas?
While the worker is skipping a transaction specified by the user, if
the user specifies another transaction to skip while the skipping is
still in progress, this new value will be reset by the worker when it
clears the skip xid. I felt that once the worker has identified the
skip xid and is about to skip it, the worker could acquire a lock to
prevent concurrency issues:
+static void
+clear_subscription_skip_xid(void)
+{
+ Relation rel;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist",
MySubscription->name);
+
+ /* Set subskipxid to null */
+ nulls[Anum_pg_subscription_subskipxid - 1] = true;
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ heap_freetuple(tup);
+ table_close(rel, RowExclusiveLock);
+}
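The lost-update hazard above can be demonstrated with a few lines of plain Python (a toy model, not PostgreSQL code; in the real patch both sides would serialize on the pg_subscription row instead):

```python
import threading

# Toy model of pg_subscription.subskipxid and the worker's clear step.
class Catalog:
    def __init__(self):
        self.subskipxid = 100     # worker is about to skip xid 100
        self.lock = threading.Lock()

def user_set(cat, xid):
    cat.subskipxid = xid

def worker_clear_unlocked(cat):
    # Worker finished skipping: blindly clear the column.
    cat.subskipxid = None

def worker_clear_locked(cat, skipped_xid):
    # Clear only the value the worker actually skipped, under a lock.
    with cat.lock:
        if cat.subskipxid == skipped_xid:
            cat.subskipxid = None

# Without synchronization, "user sets 200, worker clears" loses 200:
cat = Catalog()
user_set(cat, 200)
worker_clear_unlocked(cat)
assert cat.subskipxid is None   # the user's new setting is gone

# With the check-and-clear serialized, the new setting survives:
cat = Catalog()
user_set(cat, 200)
worker_clear_locked(cat, 100)   # worker skipped 100; 200 survives
assert cat.subskipxid == 200
```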
Regards,
Vignesh
On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
While the worker is skipping a transaction specified by the user, if
the user specifies another transaction to skip while the skipping is
still in progress, this new value will be reset by the worker when it
clears the skip xid. I felt that once the worker has identified the
skip xid and is about to skip it, the worker could acquire a lock to
prevent concurrency issues:
That's a good point.
If only the last_error_xid could be skipped, then this wouldn't be an
issue, right?
If a different xid to skip is specified while the worker is currently
skipping a transaction, should that even be allowed?
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Dec 3, 2021 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Skipping a whole transaction by specifying xid would be a good start.
Ideally, we'd like to automatically skip only operations within the
transaction that fail but it seems not easy to achieve. If we allow
specifying operations and/or relations, probably multiple operations
or relations need to be specified in some cases. Otherwise, the
subscriber cannot continue logical replication if the transaction has
multiple operations on different relations that fail. But similar to
the idea of specifying multiple xids, we need to note the fact that the
user wouldn't know of the second operation failure until the apply
worker applies the change. So I'm not sure there are many use cases in
practice where users can specify multiple operations and relations in
order to skip applies that fail.
I think there would be use cases for specifying the relations or
operations. For example, if the user finds an issue when inserting
into a particular relation, then based on some manual investigation he
may find that the table has a constraint on the subscriber side that
is not present on the publisher side, and that this constraint is
causing the failure. The user may be okay with skipping the changes
for this table but not for other tables. There might also be a few
more tables designed on the same principle that can hit a similar
error, so isn't it good to provide an option to give the list of all
such tables?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Dec 14, 2021 at 8:20 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 13, 2021 at 6:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
In streaming cases, we don’t know when stream-commit or stream-abort
comes and another conflict could occur on the subscription in the
meanwhile. But given that (we expect) this feature is used after the
apply worker enters into an error loop, this is unlikely to happen in
practice unless the user sets the wrong XID. Similarly, in 2PC cases,
we don’t know when commit-prepared or rollback-prepared comes and
another conflict could occur in the meanwhile. But this could occur in
practice even if the user specified the correct XID. Therefore, if we
disallow changing skip_xid until the subscriber receives
commit-prepared or rollback-prepared, we cannot skip the second
transaction that conflicts with data on the subscriber.
I agree with this theory. Can we reflect this in comments so that in
the future we know why we didn't pursue this direction?
I might be missing something here, but for streaming transactions,
users can decide whether they want to skip or not only once we start
applying, no? I mean, only once we start applying the changes can we
get some errors, and by that time we must have all the changes for
the transaction. So I do not understand the point we are trying to
discuss here?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Dec 14, 2021 at 1:07 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Dec 14, 2021 at 8:20 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 13, 2021 at 6:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
In streaming cases, we don’t know when stream-commit or stream-abort
comes and another conflict could occur on the subscription in the
meanwhile. But given that (we expect) this feature is used after the
apply worker enters into an error loop, this is unlikely to happen in
practice unless the user sets the wrong XID. Similarly, in 2PC cases,
we don’t know when commit-prepared or rollback-prepared comes and
another conflict could occur in the meanwhile. But this could occur in
practice even if the user specified the correct XID. Therefore, if we
disallow to change skip_xid until the subscriber receives
commit-prepared or rollback-prepared, we cannot skip the second
transaction that conflicts with data on the subscriber.I agree with this theory. Can we reflect this in comments so that in
the future we know why we didn't pursue this direction?I might be missing something here, but for streaming, transaction
users can decide whether they wants to skip or not only once we start
applying no? I mean only once we start applying the changes we can
get some errors and by that time we must be having all the changes for
the transaction.
That is right and as per my understanding, the patch is trying to
accomplish the same.
So I do not understand the point we are trying to
discuss here?
The point is whether we can skip the changes during streaming
itself, i.e., when we get the changes and write them to a stream file. Now,
it is possible that streams from multiple transactions can be
interleaved and users can change the skip_xid in between. It is not
that we can't handle this but that would require a more complex design
and it doesn't seem worth it because we can anyway skip the changes
while applying as you mentioned in the previous paragraph.
--
With Regards,
Amit Kapila.
On Fri, Dec 10, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I am thinking that we can start a transaction, update the catalog,
commit that transaction. Then start a new one to update
origin_lsn/timestamp, finishprepared, and commit it. Now, if it
crashes after the first transaction, only commit prepared will be
resent again and this time we don't need to update the catalog as that
entry would be already cleared.
Sounds good. In the crash case, it should be fine since we will just
commit an empty transaction. The same is true for the case where
skip_xid has been changed after skipping and preparing the transaction
and before handling commit_prepared.
Regarding the case where the user specifies the XID of the transaction
after it is prepared on the subscriber (i.e., the transaction is not
empty), we won’t skip committing the prepared transaction. But I think
that we don't need to support skipping already-prepared transaction
since such transaction doesn't conflict with anything regardless of
having changed or not.
Yeah, this makes sense to me.
I've attached an updated patch. The new syntax is like "ALTER
SUBSCRIPTION testsub SKIP (xid = '123')".
I've been thinking we can add some safeguard for the case where
the user specified the wrong xid. For example, can we somewhat use the
stats in pg_stat_subscription_workers? An idea is that logical
replication worker fetches the xid from the stats when reading the
subscription and skips the transaction if the xid matches to
subskipxid. That is, the worker checks the error reported by the
worker previously working on the same subscription. The error might
not be a conflict error (e.g., a connection error, etc.) or might have
been cleared by the reset function. But given that the worker is in an
error loop, the worker can eventually get the xid in question. We can
prevent an unrelated transaction from being skipped unexpectedly. It
seems not a stable solution though. Or it might be enough to warn
users when they specified an XID that doesn’t match to last_error_xid.
Anyway, I think it’s better to have more discussion on this. Any
ideas?
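To make the proposed workflow concrete, here is a rough sketch of how a user would put the pieces together, using the syntax from the patch under discussion (this is not part of any released PostgreSQL, and the option names and reset semantics may still change):

```sql
-- The failing remote transaction's XID appears in the apply worker's
-- errcontext on the subscriber, e.g.:
--   CONTEXT: during apply of "INSERT" for relation "public.test" in
--   transaction with xid 590 commit timestamp ...

-- Tell the apply worker to skip that transaction:
ALTER SUBSCRIPTION test_sub SKIP (xid = '590');

-- Per the current patch, the worker clears subskipxid itself after
-- skipping; setting -1 (reset semantics still under discussion)
-- clears it manually:
ALTER SUBSCRIPTION test_sub SKIP (xid = '-1');
```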
Few comments:
1) Should we check if a conflicting option is specified, like the others above:
+ else if (strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (strcmp(xid_str, "-1") == 0)
+ {
+ /* Setting -1 to xid means to reset it */
+ xid = InvalidTransactionId;
+ }
+ else
+ {
2) Currently only superusers can set skip xid, we can add this in the
documentation:
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
3) There is an extra tab before "The resolution can be done ...", it
can be removed.
+ Skip applying changes of the particular transaction. If incoming data
+ violates any constraints the logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming change or
by skipping
+ the whole transaction. The logical replication worker skips all data
4) An xid of -2 is currently allowed; maybe that is ok. If it is fine, we
can remove it from the fail section.
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ERROR: invalid transaction id: 1.1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = -2);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction id: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
Regards,
Vignesh
On Tue, Dec 14, 2021 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I agree with this theory. Can we reflect this in comments so that in
the future we know why we didn't pursue this direction?
I might be missing something here, but for streaming, transaction
users can decide whether they wants to skip or not only once we start
applying no? I mean only once we start applying the changes we can
get some errors and by that time we must be having all the changes for
the transaction.
That is right and as per my understanding, the patch is trying to
accomplish the same.
So I do not understand the point we are trying to
discuss here?
The point is that whether we can skip the changes while streaming
itself like when we get the changes and write to a stream file. Now,
it is possible that streams from multiple transactions can be
interleaved and users can change the skip_xid in between. It is not
that we can't handle this but that would require a more complex design
and it doesn't seem worth it because we can anyway skip the changes
while applying as you mentioned in the previous paragraph.
Actually, I was trying to understand the use case for skipping while
streaming. During streaming we are not doing any database
operation, which means this will not generate any error. So IIUC, there
is no use case for skipping while streaming itself? Is there any use
case which I am not aware of?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Dec 14, 2021 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Dec 14, 2021 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I agree with this theory. Can we reflect this in comments so that in
the future we know why we didn't pursue this direction?
I might be missing something here, but for streaming, transaction
users can decide whether they wants to skip or not only once we start
applying no? I mean only once we start applying the changes we can
get some errors and by that time we must be having all the changes for
the transaction.
That is right and as per my understanding, the patch is trying to
accomplish the same.
So I do not understand the point we are trying to
discuss here?
The point is that whether we can skip the changes while streaming
itself like when we get the changes and write to a stream file. Now,
it is possible that streams from multiple transactions can be
interleaved and users can change the skip_xid in between. It is not
that we can't handle this but that would require a more complex design
and it doesn't seem worth it because we can anyway skip the changes
while applying as you mentioned in the previous paragraph.
Actually, I was trying to understand the use case for skipping while
streaming. Actually, during streaming we are not doing any database
operation that means this will not generate any error.
Say, there is an error the first time when we start to apply changes
for such a transaction. So, such a transaction will be streamed again.
Say, the user has set the skip_xid before we stream a second time, so
this time, we can skip it either during the stream phase or apply
phase. I think the patch is skipping it during apply phase.
Sawada-San, please confirm if my understanding is correct?
--
With Regards,
Amit Kapila.
On Tue, Dec 14, 2021 at 8:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 14, 2021 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Dec 14, 2021 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I agree with this theory. Can we reflect this in comments so that in
the future we know why we didn't pursue this direction?
I might be missing something here, but for streaming, transaction
users can decide whether they wants to skip or not only once we start
applying no? I mean only once we start applying the changes we can
get some errors and by that time we must be having all the changes for
the transaction.
That is right and as per my understanding, the patch is trying to
accomplish the same.
So I do not understand the point we are trying to
discuss here?
The point is that whether we can skip the changes while streaming
itself like when we get the changes and write to a stream file. Now,
it is possible that streams from multiple transactions can be
interleaved and users can change the skip_xid in between. It is not
that we can't handle this but that would require a more complex design
and it doesn't seem worth it because we can anyway skip the changes
while applying as you mentioned in the previous paragraph.
Actually, I was trying to understand the use case for skipping while
streaming. Actually, during streaming we are not doing any database
operation that means this will not generate any error.
Say, there is an error the first time when we start to apply changes
for such a transaction. So, such a transaction will be streamed again.
Say, the user has set the skip_xid before we stream a second time, so
this time, we can skip it either during the stream phase or apply
phase. I think the patch is skipping it during apply phase.
Sawada-San, please confirm if my understanding is correct?
My understanding is the same. The patch doesn't skip the streaming
phase but starts skipping when starting to apply changes. That is, we
receive streamed changes and write them to the stream file anyway
regardless of skip_xid. When receiving the stream-commit message, we
check whether or not we skip this transaction, and if so, we apply all
messages in the stream file other than the data modification messages.
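In pseudocode, that stream-commit handling looks roughly like this (illustrative only; the names below are placeholders, not necessarily what the patch uses):

```
/* On stream-commit for remote transaction `xid`:
 *   - streamed changes are always received and spooled to the
 *     stream file, regardless of skip_xid;
 *   - the skip decision is made only when the commit arrives.
 */
on stream_commit(xid, commit_data):
    if xid == subscription.skip_xid:
        replay stream file, applying only non-data messages
        (relation/type metadata), discarding INSERT/UPDATE/DELETE
        advance replication origin to commit_data.end_lsn
        clear subscription.skip_xid
    else:
        replay stream file, applying all messages
```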
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
While the worker is skipping one of the skip transactions specified by
the user and immediately if the user specifies another skip
transaction while the skipping of the transaction is in progress this
new value will be reset by the worker while clearing the skip xid. I
felt once the worker has identified the skip xid and is about to skip
the xid, the worker can acquire a lock to prevent concurrency issues:
That's a good point.
If only the last_error_xid could be skipped, then this wouldn't be an
issue, right?
If a different xid to skip is specified while the worker is currently
skipping a transaction, should that even be allowed?
We don't expect such usage, but yes, it could happen and seems not
good. I thought we could acquire a share lock on pg_subscription during
the skip, but I'm not sure it's a good idea. It would be better if we can
find a way to allow users to specify only the XID that has failed.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Dec 14, 2021 at 4:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Actually, I was trying to understand the use case for skipping while
streaming. Actually, during streaming we are not doing any database
operation that means this will not generate any error.Say, there is an error the first time when we start to apply changes
for such a transaction. So, such a transaction will be streamed again.
Say, the user has set the skip_xid before we stream a second time, so
this time, we can skip it either during the stream phase or apply
phase. I think the patch is skipping it during apply phase.
Sawada-San, please confirm if my understanding is correct?
Got it, thanks for clarifying.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
While the worker is skipping one of the skip transactions specified by
the user and immediately if the user specifies another skip
transaction while the skipping of the transaction is in progress this
new value will be reset by the worker while clearing the skip xid. I
felt once the worker has identified the skip xid and is about to skip
the xid, the worker can acquire a lock to prevent concurrency issues:
That's a good point.
If only the last_error_xid could be skipped, then this wouldn't be an
issue, right?
If a different xid to skip is specified while the worker is currently
skipping a transaction, should that even be allowed?
We don't expect such usage but yes, it could happen and seems not
good. I thought we can acquire Share lock on pg_subscription during
the skip but not sure it's a good idea. It would be better if we can
find a way to allow users to specify only XID that has failed.
Yeah, but as we don't have a definite way to allow specifying only
failed XID, I think it is better to use share lock on that particular
subscription. We are already using it for add/update rel state (see,
AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
another place to use a similar technique.
--
With Regards,
Amit Kapila.
On Tue, Dec 14, 2021 at 11:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Dec 3, 2021 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Skipping a whole transaction by specifying xid would be a good start.
Ideally, we'd like to automatically skip only operations within the
transaction that fail but it seems not easy to achieve. If we allow
specifying operations and/or relations, probably multiple operations
or relations need to be specified in some cases. Otherwise, the
subscriber cannot continue logical replication if the transaction has
multiple operations on different relations that fail. But similar to
the idea of specifying multiple xids, we need to note the fact that
user wouldn't know of the second operation failure unless the apply
worker applies the change. So I'm not sure there are many use cases in
practice where users can specify multiple operations and relations in
order to skip applies that fail.
I think there would be use cases for specifying the relations or
operation, e.g. if the user finds an issue in inserting in a
particular relation then maybe based on some manual investigation he
founds that the table has some constraint due to that it is failing on
the subscriber side but on the publisher side that constraint is not
there so maybe the user is okay to skip the changes for this table and
not for other tables, or there might be a few more tables which are
designed based on the same principle and can have similar error so
isn't it good to provide an option to give the list of all such
tables.
That's right and I agree there could be some use case for it and even
specifying the operation but I think we can always extend the existing
feature for it if the need arises. Note that the user can anyway only
specify a single relation or operation, because there is a way to
know about only one error at a time, and until that is resolved, the
apply process won't proceed. We have discussed providing these
additional options in this
thread but thought of doing it later once we have the base feature and
based on the feedback from users.
--
With Regards,
Amit Kapila.
On Wed, Dec 15, 2021 at 9:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 14, 2021 at 11:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
That's right and I agree there could be some use case for it and even
specifying the operation but I think we can always extend the existing
feature for it if the need arises. Note that the user can anyway only
specify a single relation or an operation because there is a way to
know only one error and till that is resolved, the apply process won't
proceed. We have discussed providing these additional options in this
thread but thought of doing it later once we have the base feature and
based on the feedback from users.
Yeah, I only wanted to make the point that this could be useful, it
seems we are on the same page. I agree we can extend it in the future
as well.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Dec 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
We don't expect such usage but yes, it could happen and seems not
good. I thought we can acquire Share lock on pg_subscription during
the skip but not sure it's a good idea. It would be better if we can
find a way to allow users to specify only XID that has failed.
Yes, I agree that would be better.
If you didn't do that, I think you'd need to queue the XIDs to be
skipped (rather than locking).
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Dec 15, 2021 at 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
While the worker is skipping one of the skip transactions specified by
the user and immediately if the user specifies another skip
transaction while the skipping of the transaction is in progress this
new value will be reset by the worker while clearing the skip xid. I
felt once the worker has identified the skip xid and is about to skip
the xid, the worker can acquire a lock to prevent concurrency issues:
That's a good point.
If only the last_error_xid could be skipped, then this wouldn't be an
issue, right?
If a different xid to skip is specified while the worker is currently
skipping a transaction, should that even be allowed?
We don't expect such usage but yes, it could happen and seems not
good. I thought we can acquire Share lock on pg_subscription during
the skip but not sure it's a good idea. It would be better if we can
find a way to allow users to specify only XID that has failed.
Yeah, but as we don't have a definite way to allow specifying only
failed XID, I think it is better to use share lock on that particular
subscription. We are already using it for add/update rel state (see,
AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
another place to use a similar technique.
Yes, but it seems to mean that we disallow users to change skip_xid
while the apply worker is skipping changes, so we will end up having
the same problem we discussed so far.
In the current patch, we don't clear skip_xid at prepare time but do
that at commit-prepared time. But we cannot keep holding the lock until
commit-prepared comes because we don’t know when commit-prepared
comes. It’s possible that another conflict occurs before the
commit-prepared comes. We also cannot only clear skip_xid at prepare
time because it doesn’t solve the concurrency problem at
commit-prepared time. So if my understanding is correct, we need to
both clear skip_xid and unlock the lock at prepare time, and commit
the prepared (empty) transaction at commit-prepared time (I assume
that we prepare even empty transactions).
Suppose that at prepare time, we clear skip_xid (and release the lock)
and then prepare the transaction, if the server crashes right after
clearing skip_xid, skip_xid is already cleared but the transaction
will be sent again. The user has to specify skip_xid again. So let’s
change the order; we prepare the transaction and then clear skip_xid.
But if the server crashes between them, the transaction won’t be sent
again, but skip_xid is left. The user has to clear it. The left
skip_xid can automatically be cleared at commit-prepared time if XID
in the commit-prepared message matches skip_xid, but this actually
doesn’t solve the concurrency problem. If the user changed skip_xid
before commit-prepared, we would end up clearing the value. So we
might want to hold the lock until we clear skip_xid but we want to
avoid that as I explained first. It seems like we entered a loop.
It sounds better among these ideas that we clear skip_xid and then
prepare the transaction. Or we might want to revisit the idea of
storing skip_xid on shmem (e.g., ReplicationState) instead of the
catalog.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Dec 15, 2021 at 8:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 15, 2021 at 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
While the worker is skipping one of the skip transactions specified by
the user and immediately if the user specifies another skip
transaction while the skipping of the transaction is in progress this
new value will be reset by the worker while clearing the skip xid. I
felt once the worker has identified the skip xid and is about to skip
the xid, the worker can acquire a lock to prevent concurrency issues:
That's a good point.
If only the last_error_xid could be skipped, then this wouldn't be an
issue, right?
If a different xid to skip is specified while the worker is currently
skipping a transaction, should that even be allowed?
We don't expect such usage but yes, it could happen and seems not
good. I thought we can acquire Share lock on pg_subscription during
the skip but not sure it's a good idea. It would be better if we can
find a way to allow users to specify only XID that has failed.
Yeah, but as we don't have a definite way to allow specifying only
failed XID, I think it is better to use share lock on that particular
subscription. We are already using it for add/update rel state (see,
AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
another place to use a similar technique.
Yes, but it seems to mean that we disallow users to change skip_xid
while the apply worker is skipping changes so we will end up having
the same problem we discussed so far;
I thought we just want to lock before clearing the skip_xid, something
like: take the lock, check if the skip_xid in the catalog is the same
as the one we have skipped; if it is the same, then clear it; otherwise,
leave it as it is. How will that disallow users to change skip_xid when
we are skipping changes?
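So the clearing step would become a compare-and-clear under the lock. A minimal sketch, assuming the clear_subscription_skip_xid() function quoted upthread, a patch-defined subskipxid column, and a share lock on the subscription object (illustrative, not the final patch):

```c
/* Illustrative sketch: clear subskipxid only if it still matches the
 * xid we actually skipped; a concurrent ALTER SUBSCRIPTION ... SKIP
 * that set a different xid is left untouched. */
LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
                 AccessShareLock);
tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
                          ObjectIdGetDatum(MySubscription->oid));
subform = (Form_pg_subscription) GETSTRUCT(tup);
if (TransactionIdEquals(subform->subskipxid, skipped_xid))
    clear_subscription_skip_xid();  /* catalog update as quoted upthread */
/* else: the user set a new skip_xid concurrently; leave it alone */
```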
--
With Regards,
Amit Kapila.
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 15, 2021 at 8:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Dec 15, 2021 at 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
While the worker is skipping one of the skip transactions specified by
the user and immediately if the user specifies another skip
transaction while the skipping of the transaction is in progress this
new value will be reset by the worker while clearing the skip xid. I
felt once the worker has identified the skip xid and is about to skip
the xid, the worker can acquire a lock to prevent concurrency issues:
That's a good point.
If only the last_error_xid could be skipped, then this wouldn't be an
issue, right?
If a different xid to skip is specified while the worker is currently
skipping a transaction, should that even be allowed?
We don't expect such usage but yes, it could happen and seems not
good. I thought we can acquire Share lock on pg_subscription during
the skip but not sure it's a good idea. It would be better if we can
find a way to allow users to specify only XID that has failed.
Yeah, but as we don't have a definite way to allow specifying only
failed XID, I think it is better to use share lock on that particular
subscription. We are already using it for add/update rel state (see,
AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
another place to use a similar technique.
Yes, but it seems to mean that we disallow users to change skip_xid
while the apply worker is skipping changes so we will end up having
the same problem we discussed so far;
I thought we just want to lock before clearing the skip_xid something
like take the lock, check if the skip_xid in the catalog is the same
as we have skipped, if it is the same then clear it, otherwise, leave
it as it is. How will that disallow users to change skip_xid when we
are skipping changes?
Oh I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).
So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I thought we just want to lock before clearing the skip_xid something
like take the lock, check if the skip_xid in the catalog is the same
as we have skipped, if it is the same then clear it, otherwise, leave
it as it is. How will that disallow users to change skip_xid when we
are skipping changes?
Oh I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).
So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?
Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.
--
With Regards,
Amit Kapila.
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I thought we just want to lock before clearing the skip_xid something
like take the lock, check if the skip_xid in the catalog is the same
as we have skipped, if it is the same then clear it, otherwise, leave
it as it is. How will that disallow users to change skip_xid when we
are skipping changes?
Oh I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).
So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?
Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.
I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.
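For reference, a sketch of that call as it could appear in the skip path. replorigin_advance() itself is the existing function in src/backend/replication/logical/origin.c; the variable names below are illustrative placeholders, not the patch's actual code:

```c
/* Advance the origin past the skipped transaction, WAL-logging the
 * advance so the new position survives a crash (wal_log = true). */
replorigin_advance(replorigin_session_origin,
                   remote_end_lsn,  /* end LSN of the skipped tx */
                   local_lsn,       /* local LSN */
                   false,           /* go_backward */
                   true);           /* wal_log */
```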
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On 13.12.21 04:12, Greg Nancarrow wrote:
(ii) "Setting -1 means to reset the transaction ID"
Shouldn't it be explained what resetting actually does and when it can
be, or is needed to be, done? Isn't it automatically reset?
I notice that negative values (other than -1) seem to be regarded as
valid - is that right?
Also, what happens if this option is set multiple times? Does it just
override and use the latest setting? (other option handling errors out
with errorConflictingDefElem()).
e.g. alter subscription sub skip (xid = 721, xid = 722);
Let's not use magic numbers and instead use a syntax that is more
explicit, like SKIP (xid = NONE) or RESET SKIP or something like that.
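The xid = NONE idea can be illustrated with a small, self-contained sketch in plain C. The function name parse_skip_xid is hypothetical and not from the patch; FirstNormalTransactionId mirrors PostgreSQL's constant for the first non-reserved XID. Treating "none" as a reset to InvalidTransactionId avoids the -1 magic value while still rejecting reserved XIDs:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId     ((TransactionId) 0)
#define FirstNormalTransactionId ((TransactionId) 3)

/*
 * Illustrative-only parser for the proposed SKIP (xid = ...) option:
 * "none" resets the setting; anything else must be a normal XID
 * (values 0-2 are reserved in PostgreSQL).  Returns false for a value
 * that should raise an error.  This is a sketch, not the patch code.
 */
static bool
parse_skip_xid(const char *value, TransactionId *xid)
{
    char       *end;
    unsigned long parsed;

    if (strcmp(value, "none") == 0)
    {
        *xid = InvalidTransactionId;
        return true;
    }

    errno = 0;
    parsed = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0' || parsed > UINT32_MAX)
        return false;           /* not a valid xid literal */
    if ((TransactionId) parsed < FirstNormalTransactionId)
        return false;           /* reserved / invalid transaction id */

    *xid = (TransactionId) parsed;
    return true;
}
```

With this shape, a RESET syntax is unnecessary: the same option both sets and clears the stored XID.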
On Fri, Dec 17, 2021 at 3:23 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 13.12.21 04:12, Greg Nancarrow wrote:
(ii) "Setting -1 means to reset the transaction ID"
Shouldn't it be explained what resetting actually does and when it can
be, or is needed to be, done? Isn't it automatically reset?
I notice that negative values (other than -1) seem to be regarded as
valid - is that right?
Also, what happens if this option is set multiple times? Does it just
override and use the latest setting? (other option handling errors out
with errorConflictingDefElem()).
e.g. alter subscription sub skip (xid = 721, xid = 722);

Let's not use magic numbers and instead use a syntax that is more
explicit, like SKIP (xid = NONE) or RESET SKIP or something like that.
+1 for using SKIP (xid = NONE) because otherwise first we need to
introduce RESET syntax for this command.
--
With Regards,
Amit Kapila.
On Fri, Dec 17, 2021 at 7:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Dec 17, 2021 at 3:23 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

On 13.12.21 04:12, Greg Nancarrow wrote:
(ii) "Setting -1 means to reset the transaction ID"
Shouldn't it be explained what resetting actually does and when it can
be, or is needed to be, done? Isn't it automatically reset?
I notice that negative values (other than -1) seem to be regarded as
valid - is that right?
Also, what happens if this option is set multiple times? Does it just
override and use the latest setting? (other option handling errors out
with errorConflictingDefElem()).
e.g. alter subscription sub skip (xid = 721, xid = 722);

Let's not use magic numbers and instead use a syntax that is more
explicit, like SKIP (xid = NONE) or RESET SKIP or something like that.

+1 for using SKIP (xid = NONE) because otherwise first we need to
introduce RESET syntax for this command.
Agreed. Thank you for the comment!
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I thought we just want to lock before clearing the skip_xid something
like take the lock, check if the skip_xid in the catalog is the same
as we have skipped, if it is the same then clear it, otherwise, leave
it as it is. How will that disallow users to change skip_xid when we
are skipping changes?

Oh I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).

So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.

I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.
I've tested it and realized that we cannot use replorigin_advance()
for this purpose without changes. That is, the current
replorigin_advance() doesn't allow to advance the origin by the owner:
/* Make sure it's not used by somebody else */
if (replication_state->acquired_by != 0)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
errmsg("replication origin with OID %d is already
active for PID %d",
replication_state->roident,
replication_state->acquired_by)));
}
So we need to change it so that the origin owner can advance its
origin, which makes sense to me.
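A minimal mock of the change being described, assuming the fix simply exempts the acquiring process itself. The names acquired_by and MyProcPid mirror the PostgreSQL sources, but the surrounding scaffolding here is purely illustrative:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified mock of the ownership check in replorigin_advance().
 * Currently the function errors out whenever the origin is acquired by
 * anyone; the proposal is to let the acquiring process (the apply
 * worker itself) advance its own origin.
 */
typedef struct MockReplicationState
{
    int         acquired_by;    /* PID of the acquiring process, 0 if free */
} MockReplicationState;

static int  MyProcPid;          /* stand-in for the backend's global PID */

/* Current behavior: reject whenever the origin is acquired. */
static bool
can_advance_current(const MockReplicationState *state)
{
    return state->acquired_by == 0;
}

/* Proposed behavior: the owner may also advance its own origin. */
static bool
can_advance_proposed(const MockReplicationState *state)
{
    return state->acquired_by == 0 || state->acquired_by == MyProcPid;
}
```

In the real code this would relax the ereport(ERROR, ...) branch quoted above rather than return a boolean.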
Also, when we have to update the origin instead of committing the
catalog change while updating the origin, we cannot record the origin
timestamp. This behavior makes sense to me because we skipped the
transaction. But ISTM it’s not good if we emit the origin timestamp
only when directly updating the origin. So probably we need to always
omit origin timestamp.
Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. Probably it comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid on
shmem (e.g., ReplicationState)? That way, we can always advance the
origin by replorigin_advance() and don’t need to worry about a complex
case like the server crashes during preparing the transaction. I’ve
not considered the downside yet enough, though.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I thought we just want to lock before clearing the skip_xid something
like take the lock, check if the skip_xid in the catalog is the same
as we have skipped, if it is the same then clear it, otherwise, leave
it as it is. How will that disallow users to change skip_xid when we
are skipping changes?

Oh I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).

So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.

I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.

I've tested it and realized that we cannot use replorigin_advance()
for this purpose without changes. That is, the current
replorigin_advance() doesn't allow to advance the origin by the owner:

/* Make sure it's not used by somebody else */
if (replication_state->acquired_by != 0)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
errmsg("replication origin with OID %d is already
active for PID %d",
replication_state->roident,
replication_state->acquired_by)));
}

So we need to change it so that the origin owner can advance its
origin, which makes sense to me.

Also, when we have to update the origin instead of committing the
catalog change while updating the origin, we cannot record the origin
timestamp.
Is it because we currently update the origin timestamp with commit record?
This behavior makes sense to me because we skipped the
transaction. But ISTM it’s not good if we emit the origin timestamp
only when directly updating the origin. So probably we need to always
omit origin timestamp.
Do you mean to say that you want to omit it even when we are
committing the changes?
Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. Probably it comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid on
shmem (e.g., ReplicationState)?
IIRC, the problem with that idea was that we won't remember skip_xid
information after server restart and the user won't even know that it
has to set it again.
--
With Regards,
Amit Kapila.
On Wed, Jan 5, 2022 at 9:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Do you mean to say that you want to omit it even when we are
committing the changes?

Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. Probably it comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid on
shmem (e.g., ReplicationState)?

IIRC, the problem with that idea was that we won't remember skip_xid
information after server restart and the user won't even know that it
has to set it again.
I agree, that if we don't keep it in the catalog then after restart if
the transaction replayed again then the user has to set the skip xid
again and that would be pretty inconvenient because the user might
have to analyze the failure again and repeat the same process he did
before restart. But OTOH the combination of restart and the skip xid
might not be very frequent so this might not be a very bad option.
Basically, I am in favor of storing it in a catalog as that solution
looks cleaner at least from the user pov but if we think there are a
lot of complexities from the implementation pov then we might analyze
the approach of storing in shmem as well.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Jan 5, 2022 at 9:48 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jan 5, 2022 at 9:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Do you mean to say that you want to omit it even when we are
committing the changes?

Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. Probably it comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid on
shmem (e.g., ReplicationState)?

IIRC, the problem with that idea was that we won't remember skip_xid
information after server restart and the user won't even know that it
has to set it again.

I agree, that if we don't keep it in the catalog then after restart if
the transaction replayed again then the user has to set the skip xid
again and that would be pretty inconvenient because the user might
have to analyze the failure again and repeat the same process he did
before restart. But OTOH the combination of restart and the skip xid
might not be very frequent so this might not be a very bad option.
Basically, I am in favor of storing it in a catalog as that solution
looks cleaner at least from the user pov but if we think there are a
lot of complexities from the implementation pov then we might analyze
the approach of storing in shmem as well.
Fair point, but I think it is better to see the patch or the problems
that can't be solved if we pursue storing it in catalog. Even, if we
decide to store it in shmem, we need to invent some way to inform the
user that we have not honored the previous setting of skip_xid and it
needs to be reset again.
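For reference, the take-lock/compare/clear protocol discussed upthread can be sketched as a small decision function. The names here are hypothetical; the real patch operates on the pg_subscription catalog tuple and the replication origin, not plain integers:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/*
 * Hypothetical sketch of the protocol: after skipping a transaction, the
 * apply worker takes the lock that serializes skip_xid changes, compares
 * the catalog's skip_xid with the XID it actually skipped, and clears it
 * only on a match.  On a mismatch (a concurrent ALTER SUBSCRIPTION ...
 * SKIP changed it), it leaves the newer setting alone and only advances
 * the origin with replorigin_advance(..., wal_log = true).
 */
typedef enum SkipFinishAction
{
    SKIP_CLEAR_CATALOG,         /* commit catalog change clearing skip_xid */
    SKIP_ADVANCE_ORIGIN_ONLY    /* advance the origin, keep skip_xid */
} SkipFinishAction;

static SkipFinishAction
finish_skipping(TransactionId catalog_skip_xid, TransactionId skipped_xid)
{
    /* Caller is assumed to already hold the lock. */
    if (catalog_skip_xid == skipped_xid)
        return SKIP_CLEAR_CATALOG;

    return SKIP_ADVANCE_ORIGIN_ONLY;
}
```

The second branch is the case where the user would need some signal that their new skip_xid setting survived while the earlier one was consumed.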
--
With Regards,
Amit Kapila.
On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I thought we just want to lock before clearing the skip_xid something
like take the lock, check if the skip_xid in the catalog is the same
as we have skipped, if it is the same then clear it, otherwise, leave
it as it is. How will that disallow users to change skip_xid when we
are skipping changes?

Oh I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).

So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.

I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.

I've tested it and realized that we cannot use replorigin_advance()
for this purpose without changes. That is, the current
replorigin_advance() doesn't allow to advance the origin by the owner:

/* Make sure it's not used by somebody else */
if (replication_state->acquired_by != 0)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
errmsg("replication origin with OID %d is already
active for PID %d",
replication_state->roident,
replication_state->acquired_by)));
}

So we need to change it so that the origin owner can advance its
origin, which makes sense to me.

Also, when we have to update the origin instead of committing the
catalog change while updating the origin, we cannot record the origin
timestamp.

Is it because we currently update the origin timestamp with commit record?
Yes.
This behavior makes sense to me because we skipped the
transaction. But ISTM it’s not good if we emit the origin timestamp
only when directly updating the origin. So probably we need to always
omit origin timestamp.

Do you mean to say that you want to omit it even when we are
committing the changes?
Yes, it would be better to record only origin lsn in terms of consistency.
Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. Probably it comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid on
shmem (e.g., ReplicationState)?

IIRC, the problem with that idea was that we won't remember skip_xid
information after server restart and the user won't even know that it
has to set it again.
Right, I agree that it’s not convenient when the server restarts or
crashes, but these problems could not be critical in the situation
where users have to use this feature; the subscriber already entered
an error loop so they can know xid again and it’s an uncommon case
that they need to restart during skipping changes.
Anyway, I'll submit an updated patch soon so we can discuss complexity
vs. convenience.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Jan 7, 2022 at 6:35 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I thought we just want to lock before clearing the skip_xid something
like take the lock, check if the skip_xid in the catalog is the same
as we have skipped, if it is the same then clear it, otherwise, leave
it as it is. How will that disallow users to change skip_xid when we
are skipping changes?

Oh I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).

So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.

I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.

I've tested it and realized that we cannot use replorigin_advance()
for this purpose without changes. That is, the current
replorigin_advance() doesn't allow to advance the origin by the owner:

/* Make sure it's not used by somebody else */
if (replication_state->acquired_by != 0)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
errmsg("replication origin with OID %d is already
active for PID %d",
replication_state->roident,
replication_state->acquired_by)));
}

So we need to change it so that the origin owner can advance its
origin, which makes sense to me.

Also, when we have to update the origin instead of committing the
catalog change while updating the origin, we cannot record the origin
timestamp.

Is it because we currently update the origin timestamp with commit record?
Yes.
This behavior makes sense to me because we skipped the
transaction. But ISTM it’s not good if we emit the origin timestamp
only when directly updating the origin. So probably we need to always
omit origin timestamp.

Do you mean to say that you want to omit it even when we are
committing the changes?

Yes, it would be better to record only origin lsn in terms of consistency.
I am not so sure about this point because then what purpose origin
timestamp will serve in the code.
Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. Probably it comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid on
shmem (e.g., ReplicationState)?

IIRC, the problem with that idea was that we won't remember skip_xid
information after server restart and the user won't even know that it
has to set it again.

Right, I agree that it’s not convenient when the server restarts or
crashes, but these problems could not be critical in the situation
where users have to use this feature; the subscriber already entered
an error loop so they can know xid again and it’s an uncommon case
that they need to restart during skipping changes.

Anyway, I'll submit an updated patch soon so we can discuss complexity
vs. convenience.
Okay, that sounds reasonable.
--
With Regards,
Amit Kapila.
On Fri, Jan 7, 2022 at 10:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I thought we just want to lock before clearing the skip_xid something
like take the lock, check if the skip_xid in the catalog is the same
as we have skipped, if it is the same then clear it, otherwise, leave
it as it is. How will that disallow users to change skip_xid when we
are skipping changes?

Oh I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).

So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.

I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.

I've tested it and realized that we cannot use replorigin_advance()
for this purpose without changes. That is, the current
replorigin_advance() doesn't allow to advance the origin by the owner:

/* Make sure it's not used by somebody else */
if (replication_state->acquired_by != 0)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
errmsg("replication origin with OID %d is already
active for PID %d",
replication_state->roident,
replication_state->acquired_by)));
}

So we need to change it so that the origin owner can advance its
origin, which makes sense to me.

Also, when we have to update the origin instead of committing the
catalog change while updating the origin, we cannot record the origin
timestamp.

Is it because we currently update the origin timestamp with commit record?

Yes.

This behavior makes sense to me because we skipped the
transaction. But ISTM it’s not good if we emit the origin timestamp
only when directly updating the origin. So probably we need to always
omit origin timestamp.

Do you mean to say that you want to omit it even when we are
committing the changes?

Yes, it would be better to record only origin lsn in terms of consistency.

Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. Probably it comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid on
shmem (e.g., ReplicationState)?

IIRC, the problem with that idea was that we won't remember skip_xid
information after server restart and the user won't even know that it
has to set it again.

Right, I agree that it’s not convenient when the server restarts or
crashes, but these problems could not be critical in the situation
where users have to use this feature; the subscriber already entered
an error loop so they can know xid again and it’s an uncommon case
that they need to restart during skipping changes.

Anyway, I'll submit an updated patch soon so we can discuss complexity
vs. convenience.
Attached an updated patch. Please review it.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v2-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch (application/octet-stream)
From 8a81e77238265a25df0b15f9db8159ef4f9f88a9 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v2] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
If an incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX),
updating pg_subscription.subskipxid field, telling the apply worker to
skip the transaction. The apply worker skips all data modification
changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid.
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 54 ++++-
doc/src/sgml/ref/alter_subscription.sgml | 41 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/commands/subscriptioncmds.c | 52 +++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/origin.c | 3 +-
src/backend/replication/logical/worker.c | 259 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/tab-complete.c | 8 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 25 ++
src/test/regress/sql/subscription.sql | 13 ++
src/test/subscription/t/027_skip_xact.pl | 204 ++++++++++++++++
15 files changed, 679 insertions(+), 11 deletions(-)
create mode 100644 src/test/subscription/t/027_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 03e2537b07..c3c8b0b428 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7723,6 +7723,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskipxid</structfield> <type>xid</type>
+ </para>
+ <para>
+ ID of the transaction whose changes are to be skipped, if a valid
+ transaction ID; otherwise 0.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 45b2e1e28f..3237f68b04 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -333,20 +333,66 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
+ subscriber's server log.
</para>
<para>
The resolution can be done either by changing data on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the whole
+ transaction. The logical replication worker skips all data modification
+ changes within the specified transaction, including changes that may not
+ conflict with the existing data. When a conflict produces an error, it is shown in
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_xid | 716
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-09-29 15:52:45.165754+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ When deciding how to resolve a conflict, consider changing the data on the
+ subscriber so that it doesn't conflict with incoming changes, dropping the
+ conflicting constraint or unique index, or writing a trigger on the subscriber
+ to suppress or redirect conflicting incoming changes, with skipping the whole
+ transaction only as a last resort. Both skipping methods skip the whole
+ transaction, including changes that may not violate any constraint, and can
+ easily make the subscriber inconsistent, especially if a user specifies the
+ wrong transaction ID or the wrong position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..08d934f7ca 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -207,6 +208,46 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ Skip applying changes of the particular transaction. If incoming data
+ violates any constraint, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming change or by skipping
+ the whole transaction. The logical replication worker skips all data
+ modification changes within the specified transaction. Therefore, since
+ it skips the whole transaction including the changes that may not violate
+ the constraint, it should only be used as a last resort. This option has
+ no effect on transactions that are already prepared by enabling
+ <literal>two_phase</literal> on the subscriber. After the logical replication
+ successfully skips the transaction, the transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the transaction whose changes are to be skipped
+ by the logical replication worker. Setting <literal>NONE</literal> means
+ to reset the transaction ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 25021e25a4..b5c56aab33 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skipxid = subform->subskipxid;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 2b658080fe..ca7e908042 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -61,6 +61,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +83,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId xid; /* InvalidTransactionId for resetting purpose, otherwise
+ * normal transaction id */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +252,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (IsSet(opts->specified_opts, SUBOPT_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
+ else
+ {
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id: %s", xid_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_XID;
+ opts->xid = xid;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +493,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1114,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ if (IsSet(opts.specified_opts, SUBOPT_XID))
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ update_tuple = true;
+ }
+
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 6dddc07947..1ae2105ac8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9954,6 +9954,15 @@ AlterSubscriptionStmt:
(Node *)makeInteger(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 65dcd033fd..54e8b175cd 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -921,7 +921,8 @@ replorigin_advance(RepOriginId node,
LWLockAcquire(&replication_state->lock, LW_EXCLUSIVE);
/* Make sure it's not used by somebody else */
- if (replication_state->acquired_by != 0)
+ if (replication_state->acquired_by != 0 &&
+ replication_state->acquired_by != MyProcPid)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 2e79302a48..5eb22e3712 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -255,6 +256,20 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * True if we are skipping all data modification changes (INSERT, UPDATE,
+ * etc.) of the transaction specified by MySubscription->skipxid. Once we
+ * start skipping changes, we don't stop until we have skipped all changes
+ * of the transaction, even if the subscription is invalidated and
+ * MySubscription->skipxid gets changed or reset. Also, in streaming cases
+ * we don't skip receiving the changes, since we decide whether or not to
+ * skip applying them when starting to apply the spooled changes. At the
+ * end of the transaction, we disable skipping and reset the skip XID. The
+ * timing of resetting the skip XID differs between the commit case and
+ * the commit/rollback prepared cases; see the comments in those functions
+ * for details.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -330,6 +345,13 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+static void clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
@@ -789,6 +811,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -841,6 +868,11 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -854,6 +886,36 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
{
char gid[GIDSIZE];
+ /*
+ * If we are skipping all changes of this transaction, we stop doing so
+ * here, but unlike the commit case we do not clear subskipxid in the
+ * pg_subscription catalog; that is done at COMMIT PREPARED or ROLLBACK
+ * PREPARED time instead. If we updated the catalog and then prepared
+ * the transaction, a server crash between the two would leave
+ * subskipxid cleared while the transaction is resent. If we did it in
+ * the reverse order, a crash would leave subskipxid set while the
+ * transaction is not resent.
+ *
+ * subskipxid might be changed or cleared by the user before we receive
+ * COMMIT PREPARED or ROLLBACK PREPARED, but that's okay because this
+ * prepared transaction is empty.
+ *
+ * One might think that we could skip preparing the skipped transaction
+ * and also skip COMMIT PREPARED or ROLLBACK PREPARED by comparing the
+ * XID received as part of the message to the value of subskipxid. But
+ * subskipxid could be changed by the user between PREPARE and COMMIT
+ * PREPARED or ROLLBACK PREPARED. There was an idea to disallow changing
+ * subskipxid while skipping changes, but we don't know when COMMIT
+ * PREPARED or ROLLBACK PREPARED will arrive, and another conflict could
+ * occur in the meantime; if it did, we could not skip that transaction
+ * by using subskipxid. There was also an idea to check whether the
+ * transaction has been prepared by checking the GID, origin LSN, and
+ * origin timestamp of the prepared transaction, but that doesn't seem
+ * worthwhile because it requires protocol changes, and skipping
+ * transactions shouldn't be common.
+ */
+ if (is_skipping_changes())
+ stop_skipping_changes(false, InvalidXLogRecPtr, 0);
+
/*
* Compute unique GID for two_phase transactions. We don't use GID of
* prepared transaction sent by server as that can lead to deadlock when
@@ -899,9 +961,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It is
+ * done this way because at commit prepared time, we won't know whether we
+ * have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -938,6 +1000,26 @@ apply_handle_commit_prepared(StringInfo s)
logicalrep_read_commit_prepared(s, &prepare_data);
set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ /* Update the subscription cache if necessary */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid == prepare_data.xid)
+ {
+ /*
+ * Clear subskipxid in the pg_subscription catalog. This catalog update
+ * must be committed before finishing the prepared transaction because
+ * otherwise, if the server crashes between finishing the prepared
+ * transaction and the catalog update, COMMIT PREPARED won't be resent
+ * but subskipxid is left set.
+ *
+ * Also, we must not update the replication origin LSN and timestamp
+ * while committing the catalog update, so that COMMIT PREPARED will
+ * be resent in case of a crash immediately after the catalog update
+ * commits.
+ */
+ clear_subscription_skip_xid(prepare_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
gid, sizeof(gid));
@@ -979,6 +1061,19 @@ apply_handle_rollback_prepared(StringInfo s)
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ /* Update the subscription cache if necessary */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid == rollback_data.xid)
+ {
+ /*
+ * As with COMMIT PREPARED, we must clear subskipxid in pg_subscription
+ * before rolling back the prepared transaction. See the comments in
+ * apply_handle_commit_prepared() for details.
+ */
+ clear_subscription_skip_xid(rollback_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
gid, sizeof(gid));
@@ -1207,6 +1302,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of a transaction that is
+ * rolled back, but if the skip XID is set anyway, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
* just delete the files with serialized info.
@@ -1329,6 +1431,9 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1430,6 +1535,10 @@ apply_handle_stream_commit(StringInfo s)
apply_spooled_messages(xid, commit_data.commit_lsn);
+ /*
+ * Commit streamed transaction. If we're skipping this transaction,
+ * we stop it in apply_handle_commit_internal().
+ */
apply_handle_commit_internal(&commit_data);
/* unlink the files with serialized changes and subxact info */
@@ -1449,7 +1558,20 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
- if (IsTransactionState())
+ if (is_skipping_changes())
+ {
+ /*
+ * If we are skipping all changes of this transaction, we stop it
+ * and clear subskipxid of pg_subscription.
+ */
+ stop_skipping_changes(true, commit_data->end_lsn,
+ commit_data->committime);
+
+ pgstat_report_stat(false);
+
+ store_flush_position(commit_data->end_lsn);
+ }
+ else if (IsTransactionState())
{
/*
* Update origin state so we can restart streaming from correct
@@ -2319,6 +2441,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3613,6 +3746,124 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by the skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /* Make sure subscription cache is up-to-date */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid != xid)
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skipping_xid = xid;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes by resetting skipping_xid. If clear_subskipxid is true,
+ * we also clear subskipxid of pg_subscription by setting it to InvalidTransactionId.
+ */
+static void
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skipping_xid)));
+
+ if (clear_subskipxid)
+ clear_subscription_skip_xid(skipping_xid, origin_lsn, origin_timestamp);
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+}
+
+/* Clear subskipxid of pg_subscription by setting it to InvalidTransactionId */
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ TransactionId subskipxid;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Get subskipxid value */
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+ subskipxid = subform->subskipxid;
+
+ if (subskipxid == xid)
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_timestamp;
+
+ /* Invalidate subskipxid */
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /* Update the system catalog to reset the skip XID */
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+ else if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * The user has already changed subskipxid before we could clear it, so
+ * don't touch the catalog; just advance the replication origin.
+ */
+ replorigin_advance(replorigin_session_origin, origin_lsn,
+ GetXLogInsertRecPtr(),
+ false, /* go_backward */
+ true /* wal_log */);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 59cd02ebb1..fefa84610a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4301,6 +4301,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't fetch subskipxid as we don't
+ * include it in the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index b81a04c93b..6ae2ac2497 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1691,6 +1691,12 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("xid");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 21061493ea..d837a3e55b 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 593e301f7a..a679ef6d28 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3716,7 +3716,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..e747057ba0 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -93,6 +93,31 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 4294967295
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 0
+(1 row)
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ERROR: invalid transaction id: 1.1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction id: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ERROR: invalid transaction id: 1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+ERROR: invalid transaction id: 2
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..6d8392758c 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,19 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/027_skip_xact.pl b/src/test/subscription/t/027_skip_xact.pl
new file mode 100644
index 0000000000..a63c9c345e
--- /dev/null
+++ b/src/test/subscription/t/027_skip_xact.pl
@@ -0,0 +1,204 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 6;
+
+# Test skipping a transaction. This function must be called after the caller
+# has inserted data that conflicts on the subscriber. After waiting for the
+# subscription worker's stats to be updated, we skip the transaction in
+# question with ALTER SUBSCRIPTION ... SKIP. Then we check that logical
+# replication can continue working by inserting $nonconflict_data on the
+# publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname, $nonconflict_data,
+ $expected, $xid, $msg) = @_;
+
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND last_error_xid = '$xid'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]) or die "Timed out while waiting for worker error";
+
+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (xid = '$xid')");
+
+ # Restart the subscriber node so logical replication restarts without
+ # waiting for the retry interval.
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT subskipxid = 0 FROM pg_subscription WHERE subname = '$subname'");
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (two_phase = on, streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data into test_tab, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab. Then skip the transaction.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", $xid,
+ "test skipping transaction");
+
+# Test for PREPARE. Insert the same data into test_tab and PREPARE the transaction,
+# raising an error. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", $xid,
+ "test skipping prepare and commit prepared");
+
+# Test for PREPARE and ROLLBACK PREPARED.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(4)", "4", $xid,
+ "test skipping prepare and rollback prepared");
+
+# Test for STREAM COMMIT. Insert enough rows into test_tab_streaming to exceed
+# the 64kB logical_decoding_work_mem limit, again raising a unique-constraint
+# error on the subscriber while applying the spooled changes. Then skip the
+# transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(2, md5(2::text))", "2", $xid,
+ "test skipping stream-commit");
+
+# Test for STREAM PREPARE and COMMIT PREPARED.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(3, md5(3::text))", "3", $xid,
+ "test skipping stream-prepare and commit prepared");
+
+my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0", "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
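As a side note on the origin.c hunk above: it relaxes the acquired-by check so that the process that acquired the origin can advance it itself. The following is only an illustrative standalone sketch of the new condition, not the actual ReplicationState code:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified model of the check in replorigin_advance(): advancing is
 * allowed when the origin is unacquired (acquired_by == 0), or when the
 * caller is the very process that acquired it -- e.g., the apply worker
 * advancing its own origin after skipping a transaction.
 */
static bool
can_advance_origin(int acquired_by, int my_pid)
{
	return acquired_by == 0 || acquired_by == my_pid;
}
```

Before the patch, only the unacquired case was allowed, which is why the apply worker could not advance the origin it holds.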
On Fri, Jan 7, 2022 at 11:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jan 7, 2022 at 10:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I thought we just want to lock before clearing the skip_xid: something
like take the lock, check if the skip_xid in the catalog is the same
as the one we have skipped, clear it if it is the same, and otherwise
leave it as it is. How will that disallow users to change skip_xid
when we are skipping changes?

Oh, I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).

So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it, but when we are skipping all changes,
the commit might not do it as there won't be any transaction ID
assigned.

I've not tested it yet, but replorigin_advance() with wal_log = true
seems to work for this case.

I've tested it and realized that we cannot use replorigin_advance()
for this purpose without changes. That is, the current
replorigin_advance() doesn't allow the owner to advance its own origin:

/* Make sure it's not used by somebody else */
if (replication_state->acquired_by != 0)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
errmsg("replication origin with OID %d is already
active for PID %d",
replication_state->roident,
replication_state->acquired_by)));
}

So we need to change it so that the origin owner can advance its own
origin, which makes sense to me.

Also, when we have to update the origin instead of committing the
catalog change, we cannot record the origin timestamp.

Is it because we currently update the origin timestamp with the commit record?
Yes.
This behavior makes sense to me because we skipped the transaction.
But ISTM it's not good if we emit the origin timestamp only when
directly updating the origin. So probably we need to always omit the
origin timestamp.

Do you mean to say that you want to omit it even when we are
committing the changes?

Yes, it would be better to record only the origin LSN, in terms of consistency.
Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. Probably it comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid in
shmem (e.g., ReplicationState)?

IIRC, the problem with that idea was that we won't remember the
skip_xid information after a server restart, and the user won't even
know that it has to be set again.

Right, I agree that it's not convenient when the server restarts or
crashes, but these problems would not be critical in the situations
where users have to use this feature: the subscriber has already
entered an error loop, so they can find the XID again, and it's an
uncommon case that the server needs to restart while skipping changes.

Anyway, I'll submit an updated patch soon so we can discuss complexity
vs. convenience.

Attached an updated patch. Please review it.
Thanks for the updated patch, few comments:
1) Should this be case insensitive to support NONE too:
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
2) Can we have an option to specify last_error_xid of
pg_stat_subscription_workers. Something like:
alter subscription sub1 skip ( XID = 'last_subscription_error');
When the user specified last_subscription_error, it should pick
last_error_xid from pg_stat_subscription_workers.
As this operation is a critical operation, if there is an option which
could automatically pick and set from pg_stat_subscription_workers, it
would be useful.
3) Currently the following syntax is being supported, I felt this
should throw an error:
postgres=# alter subscription sub1 set ( XID = 100);
ALTER SUBSCRIPTION
4) You might need to rebase the patch:
git am v2-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch
Applying: Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
error: patch failed: doc/src/sgml/logical-replication.sgml:333
error: doc/src/sgml/logical-replication.sgml: patch does not apply
Patch failed at 0001 Add ALTER SUBSCRIPTION ... SKIP to skip the
transaction on subscriber nodes
hint: Use 'git am --show-current-patch=diff' to see the failed patch
5) You might have to rename 027_skip_xact to 028_skip_xact as
027_nosuperuser.pl already exists
diff --git a/src/test/subscription/t/027_skip_xact.pl
b/src/test/subscription/t/027_skip_xact.pl
new file mode 100644
index 0000000000..a63c9c345e
--- /dev/null
+++ b/src/test/subscription/t/027_skip_xact.pl
Regards,
Vignesh
On Thu, Dec 16, 2021 at 11:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.

I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.
IIUC, the changes corresponding to above in the latest patch are as follows:
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -921,7 +921,8 @@ replorigin_advance(RepOriginId node,
LWLockAcquire(&replication_state->lock, LW_EXCLUSIVE);
/* Make sure it's not used by somebody else */
- if (replication_state->acquired_by != 0)
+ if (replication_state->acquired_by != 0 &&
+ replication_state->acquired_by != MyProcPid)
{
...
clear_subscription_skip_xid()
{
..
+ else if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * User has already changed subskipxid before clearing the subskipxid, so
+ * don't change the catalog but just advance the replication origin.
+ */
+ replorigin_advance(replorigin_session_origin, origin_lsn,
+ GetXLogInsertRecPtr(),
+ false, /* go_backward */
+ true /* wal_log */);
+ }
..
}
I was thinking what if we don't advance origin explicitly in this
case? Actually, that will be no different than the transactions where
the apply worker doesn't apply any change because the initial sync is
in progress (see should_apply_changes_for_rel()) or we have received
an empty transaction. In those cases also, the origin lsn won't be
advanced even though we acknowledge the advanced last_received
location because of keep_alive messages. Now, it is possible after the
restart we send the old start_lsn location because the replication
origin was not updated before restart but we handle that case in the
server by starting from the last confirmed location. See below code:
CreateDecodingContext()
{
..
else if (start_lsn < slot->data.confirmed_flush)
..
Few other comments on the latest patch:
=================================
1.
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
+ subscriber's server log.
Can we slightly change the modified line to: "Details about the
conflict can be found in <xref
linkend="monitoring-pg-stat-subscription-workers"/> and the
subscriber's server log."? I think we can commit this change
separately as this is true even without this patch.
2.
The resolution can be done either by changing data on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the whole
+ transaction. This option specifies the ID of the transaction whose
+ application is to be skipped by the logical replication worker. The logical
+ replication worker skips all data modification transaction conflicts with
+ the existing data. When a conflict produce an error, it is shown in
+ <structname>pg_stat_subscription_workers</structname> view as follows:
I don't think most of the additional text added in the above paragraph
is required. We can rephrase it as: "The resolution can be done either
by changing data on the subscriber so that it does not conflict with
the incoming change or by skipping the transaction that conflicts with
the existing data. When a conflict produces an error, it is shown in
<structname>pg_stat_subscription_workers</structname> view as
follows:". After that keep the text, you have.
3.
They skip the whole transaction, including changes that may not violate any
+ constraint. They may easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
Can we slightly reword the above text as: "Skipping the whole
transaction includes skipping the changes that may not violate any
constraint. This can easily make the subscriber inconsistent,
especially if a user specifies the wrong transaction ID or the
position of origin."?
4.
The logical replication worker skips all data
+ modification changes within the specified transaction. Therefore, since
+ it skips the whole transaction including the changes that may not violate
+ the constraint, it should only be used as a last resort. This option has
+ no effect for the transaction that is already prepared with enabling
+ <literal>two_phase</literal> on susbscriber.
Let's slightly reword the above text as: "The logical replication
worker skips all data modification changes within the specified
transaction including the changes that may not violate the constraint,
so, it should only be used as a last resort. This option has no effect
on the transaction that is already prepared by enabling
<literal>two_phase</literal> on the subscriber."
5.
+ by the logical replication worker. Setting
<literal>NONE</literal> means
+ to reset the transaction ID.
Let's slightly reword the second part of the sentence as: "Setting
<literal>NONE</literal> resets the transaction ID."
6.
Once we start skipping
+ * changes, we don't stop it until the we skip all changes of the
transaction even
+ * if the subscription invalidated and MySubscription->skipxid gets
changed or reset.
/subscription invalidated/subscription is invalidated
What do you mean by subscription invalidated and how is it related to
this feature? I think we should mention something on these lines in
the docs as well.
7.
"Please refer to the comments in these functions for details.". We can
slightly modify this part of the comment as: "Please refer to the
comments in corresponding functions for details."
--
With Regards,
Amit Kapila.
On Mon, Jan 10, 2022 at 2:57 PM vignesh C <vignesh21@gmail.com> wrote:
2) Can we have an option to specify last_error_xid of
pg_stat_subscription_workers. Something like:
alter subscription sub1 skip ( XID = 'last_subscription_error');

When the user specified last_subscription_error, it should pick
last_error_xid from pg_stat_subscription_workers.
As this operation is a critical operation, if there is an option which
could automatically pick and set from pg_stat_subscription_workers, it
would be useful.
I think having some automatic functionality around this would be good
but I am not so sure about this idea because it is possible that the
error has not reached the stats collector and the user might be
referring to server logs to set the skip xid. In such cases, even
though an error would have occurred but we won't be able to set the
required xid. Now, one can imagine that if we don't get the required
value from pg_stat_subscription_workers then we can return an error to
the user indicating that she can cross-verify the server logs and set
the appropriate xid value but IMO it could be confusing. I feel even
if we want some automatic functionality like you are proposing or
something else, it could be done as a separate patch but let's wait
and see what Sawada-San or others think about this?
--
With Regards,
Amit Kapila.
On Tue, Jan 11, 2022 at 7:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jan 10, 2022 at 2:57 PM vignesh C <vignesh21@gmail.com> wrote:
2) Can we have an option to specify last_error_xid of
pg_stat_subscription_workers. Something like:
alter subscription sub1 skip ( XID = 'last_subscription_error');

When the user specified last_subscription_error, it should pick
last_error_xid from pg_stat_subscription_workers.
As this operation is a critical operation, if there is an option which
could automatically pick and set from pg_stat_subscription_workers, it
would be useful.

I think having some automatic functionality around this would be good
but I am not so sure about this idea because it is possible that the
error has not reached the stats collector and the user might be
referring to server logs to set the skip xid. In such cases, even
though an error would have occurred but we won't be able to set the
required xid. Now, one can imagine that if we don't get the required
value from pg_stat_subscription_workers then we can return an error to
the user indicating that she can cross-verify the server logs and set
the appropriate xid value but IMO it could be confusing. I feel even
if we want some automatic functionality like you are proposing or
something else, it could be done as a separate patch but let's wait
and see what Sawada-San or others think about this?
If we are ok with the suggested idea then it can be done as a separate
patch, I agree that it need not be part of the existing patch.
Regards,
Vignesh
On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 16, 2021 at 11:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.

I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.

IIUC, the changes corresponding to above in the latest patch are as follows:

--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -921,7 +921,8 @@ replorigin_advance(RepOriginId node,
LWLockAcquire(&replication_state->lock, LW_EXCLUSIVE);

/* Make sure it's not used by somebody else */
- if (replication_state->acquired_by != 0)
+ if (replication_state->acquired_by != 0 &&
+ replication_state->acquired_by != MyProcPid)
{
...

clear_subscription_skip_xid()
{
..
+ else if (!XLogRecPtrIsInvalid(origin_lsn))
+ {
+ /*
+ * User has already changed subskipxid before clearing the subskipxid, so
+ * don't change the catalog but just advance the replication origin.
+ */
+ replorigin_advance(replorigin_session_origin, origin_lsn,
+ GetXLogInsertRecPtr(),
+ false, /* go_backward */
+ true /* wal_log */);
+ }
..
}

I was thinking what if we don't advance origin explicitly in this
case? Actually, that will be no different than the transactions where
the apply worker doesn't apply any change because the initial sync is
in progress (see should_apply_changes_for_rel()) or we have received
an empty transaction. In those cases also, the origin lsn won't be
advanced even though we acknowledge the advanced last_received
location because of keep_alive messages. Now, it is possible after the
restart we send the old start_lsn location because the replication
origin was not updated before restart but we handle that case in the
server by starting from the last confirmed location. See below code:

CreateDecodingContext()
{
..
else if (start_lsn < slot->data.confirmed_flush)
..
Good point. Probably one minor thing that is different from the
transaction where the apply worker applied an empty transaction is a
case where the server restarts/crashes before sending an
acknowledgment of the flush location. That is, in the case of the
empty transaction, the publisher sends an empty transaction again. On
the other hand in the case of skipping the transaction, a non-empty
transaction will be sent again but skip_xid is already changed or
cleared, therefore the user will have to specify skip_xid again. If we
write replication origin WAL record to advance the origin lsn, it
reduces the possibility of that. But I think it’s a very minor case so
we won’t need to deal with that.
Anyway, according to your analysis, I think we don't necessarily need
to do replorigin_advance() in this case.
Few other comments on the latest patch:
=================================
1.
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
+ subscriber's server log.

Can we slightly change the modified line to: "Details about the
conflict can be found in <xref
linkend="monitoring-pg-stat-subscription-workers"/> and the
subscriber's server log."?
Will fix it.
I think we can commit this change
separately as this is true even without this patch.
Right. It seems an oversight of 8d74fc96db. I've attached the patch.
2.
The resolution can be done either by changing data on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the whole
+ transaction. This option specifies the ID of the transaction whose
+ application is to be skipped by the logical replication worker. The logical
+ replication worker skips all data modification transaction conflicts with
+ the existing data. When a conflict produce an error, it is shown in
+ <structname>pg_stat_subscription_workers</structname> view as follows:

I don't think most of the additional text added in the above paragraph
is required. We can rephrase it as: "The resolution can be done either
by changing data on the subscriber so that it does not conflict with
the incoming change or by skipping the transaction that conflicts with
the existing data. When a conflict produces an error, it is shown in
<structname>pg_stat_subscription_workers</structname> view as
follows:". After that keep the text, you have.
Agreed, will fix.
3.
They skip the whole transaction, including changes that may not violate any
+ constraint. They may easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.

Can we slightly reword the above text as: "Skipping the whole
transaction includes skipping the changes that may not violate any
constraint. This can easily make the subscriber inconsistent,
especially if a user specifies the wrong transaction ID or the
position of origin."?
Will fix.
4.
The logical replication worker skips all data
+ modification changes within the specified transaction. Therefore, since
+ it skips the whole transaction including the changes that may not violate
+ the constraint, it should only be used as a last resort. This option has
+ no effect for the transaction that is already prepared with enabling
+ <literal>two_phase</literal> on susbscriber.

Let's slightly reword the above text as: "The logical replication
worker skips all data modification changes within the specified
transaction including the changes that may not violate the constraint,
so, it should only be used as a last resort. This option has no effect
on the transaction that is already prepared by enabling
<literal>two_phase</literal> on the subscriber."
Will fix.
5.
+ by the logical replication worker. Setting <literal>NONE</literal> means
+ to reset the transaction ID.

Let's slightly reword the second part of the sentence as: "Setting
<literal>NONE</literal> resets the transaction ID."
Will fix.
6.
Once we start skipping
+ * changes, we don't stop it until the we skip all changes of the transaction even
+ * if the subscription invalidated and MySubscription->skipxid gets changed or reset.

/subscription invalidated/subscription is invalidated
Will fix.
What do you mean by subscription invalidated and how is it related to
this feature? I think we should mention something on these lines in
the docs as well.
I meant that MySubscription, a cache of pg_subscription entry, is
invalidated by the catalog change. IIUC while applying changes we
don't re-read pg_subscription (i.e., not calling
maybe_reread_subscription()). Similarly, while skipping changes, we
also don't do that. Therefore, even if skip_xid has been changed while
skipping changes, we don't stop skipping changes.
7.
"Please refer to the comments in these functions for details.". We can
slightly modify this part of the comment as: "Please refer to the
comments in corresponding functions for details."
Will fix.
I'll submit an updated patch soon.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
doc_update.patch (application/octet-stream)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index fb4472356d..96b4886e08 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -346,7 +346,9 @@
<para>
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <link linkend="monitoring-pg-stat-subscription-workers">
+ <structname>pg_stat_subscription_workers</structname></link> and the
+ subscriber's server log.
</para>
<para>
On Tue, Jan 11, 2022 at 11:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jan 10, 2022 at 2:57 PM vignesh C <vignesh21@gmail.com> wrote:
2) Can we have an option to specify last_error_xid of
pg_stat_subscription_workers. Something like:
alter subscription sub1 skip ( XID = 'last_subscription_error');

When the user specified last_subscription_error, it should pick
last_error_xid from pg_stat_subscription_workers.
As this operation is a critical operation, if there is an option which
could automatically pick and set from pg_stat_subscription_workers, it
would be useful.

I think having some automatic functionality around this would be good
but I am not so sure about this idea because it is possible that the
error has not reached the stats collector and the user might be
referring to server logs to set the skip xid. In such cases, even
though an error would have occurred but we won't be able to set the
required xid. Now, one can imagine that if we don't get the required
value from pg_stat_subscription_workers then we can return an error to
the user indicating that she can cross-verify the server logs and set
the appropriate xid value but IMO it could be confusing. I feel even
if we want some automatic functionality like you are proposing or
something else, it could be done as a separate patch but let's wait
and see what Sawada-San or others think about this?
Agreed. The automatically setting XID would be a good idea but we can
do that in a separate patch so we can keep the first patch simple.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I was thinking what if we don't advance origin explicitly in this
case? Actually, that will be no different than the transactions where
the apply worker doesn't apply any change because the initial sync is
in progress (see should_apply_changes_for_rel()) or we have received
an empty transaction. In those cases also, the origin lsn won't be
advanced even though we acknowledge the advanced last_received
location because of keep_alive messages. Now, it is possible after the
restart we send the old start_lsn location because the replication
origin was not updated before restart but we handle that case in the
server by starting from the last confirmed location. See below code:

CreateDecodingContext()
{
..
else if (start_lsn < slot->data.confirmed_flush)
..

Good point. Probably one minor thing that is different from the
transaction where the apply worker applied an empty transaction is a
case where the server restarts/crashes before sending an
acknowledgment of the flush location. That is, in the case of the
empty transaction, the publisher sends an empty transaction again. On
the other hand in the case of skipping the transaction, a non-empty
transaction will be sent again but skip_xid is already changed or
cleared, therefore the user will have to specify skip_xid again. If we
write replication origin WAL record to advance the origin lsn, it
reduces the possibility of that. But I think it’s a very minor case so
we won’t need to deal with that.
Yeah, in the worst case, it will lead to conflict again and the user
needs to set the xid again.
Anyway, according to your analysis, I think we don't necessarily need
to do replorigin_advance() in this case.
Right.
5.
+ by the logical replication worker. Setting <literal>NONE</literal> means
+ to reset the transaction ID.

Let's slightly reword the second part of the sentence as: "Setting
<literal>NONE</literal> resets the transaction ID."

Will fix.

6.
Once we start skipping
+ * changes, we don't stop it until the we skip all changes of the transaction even
+ * if the subscription invalidated and MySubscription->skipxid gets changed or reset.

/subscription invalidated/subscription is invalidated

Will fix.

What do you mean by subscription invalidated and how is it related to
this feature? I think we should mention something on these lines in
the docs as well.

I meant that MySubscription, a cache of pg_subscription entry, is
invalidated by the catalog change. IIUC while applying changes we
don't re-read pg_subscription (i.e., not calling
maybe_reread_subscription()). Similarly, while skipping changes, we
also don't do that. Therefore, even if skip_xid has been changed while
skipping changes, we don't stop skipping changes.
Okay, but I don't think we need to mention subscription is invalidated
as that could be confusing, the other part of the comment is quite
clear.
--
With Regards,
Amit Kapila.
On Tue, Jan 11, 2022 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I was thinking what if we don't advance origin explicitly in this
case? Actually, that will be no different than the transactions where
the apply worker doesn't apply any change because the initial sync is
in progress (see should_apply_changes_for_rel()) or we have received
an empty transaction. In those cases also, the origin lsn won't be
advanced even though we acknowledge the advanced last_received
location because of keep_alive messages. Now, it is possible after the
restart we send the old start_lsn location because the replication
origin was not updated before restart but we handle that case in the
server by starting from the last confirmed location. See below code:

CreateDecodingContext()
{
..
else if (start_lsn < slot->data.confirmed_flush)
..

Good point. Probably one minor thing that is different from the
transaction where the apply worker applied an empty transaction is a
case where the server restarts/crashes before sending an
acknowledgment of the flush location. That is, in the case of the
empty transaction, the publisher sends an empty transaction again. On
the other hand in the case of skipping the transaction, a non-empty
transaction will be sent again but skip_xid is already changed or
cleared, therefore the user will have to specify skip_xid again. If we
write replication origin WAL record to advance the origin lsn, it
reduces the possibility of that. But I think it’s a very minor case so
we won’t need to deal with that.

Yeah, in the worst case, it will lead to conflict again and the user
needs to set the xid again.
On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. That is, currently we don't clear skip_xid while
handling a prepare message but do that while handling commit/rollback
prepared message, in order to avoid the worst case. If we do both
while handling a prepare message and the server crashes between them,
it ends up that skip_xid is cleared and the transaction will be
resent, which is identical to the worst-case above. Therefore, if we
accept this situation because of its low probability, probably we can
do the same things for other cases too, which makes the patch simple
especially for prepare and commit/rollback-prepared cases. What do you
think?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 11, 2022 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I was thinking what if we don't advance origin explicitly in this
case? Actually, that will be no different than the transactions where
the apply worker doesn't apply any change because the initial sync is
in progress (see should_apply_changes_for_rel()) or we have received
an empty transaction. In those cases also, the origin lsn won't be
advanced even though we acknowledge the advanced last_received
location because of keep_alive messages. Now, it is possible after the
restart we send the old start_lsn location because the replication
origin was not updated before restart but we handle that case in the
server by starting from the last confirmed location. See below code:

CreateDecodingContext()
{
..
else if (start_lsn < slot->data.confirmed_flush)
..

Good point. Probably one minor thing that is different from the
transaction where the apply worker applied an empty transaction is a
case where the server restarts/crashes before sending an
acknowledgment of the flush location. That is, in the case of the
empty transaction, the publisher sends an empty transaction again. On
the other hand in the case of skipping the transaction, a non-empty
transaction will be sent again but skip_xid is already changed or
cleared, therefore the user will have to specify skip_xid again. If we
write replication origin WAL record to advance the origin lsn, it
reduces the possibility of that. But I think it’s a very minor case so
we won’t need to deal with that.

Yeah, in the worst case, it will lead to conflict again and the user
needs to set the xid again.

On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. That is, currently we don't clear skip_xid while
handling a prepare message but do that while handling commit/rollback
prepared message, in order to avoid the worst case. If we do both
while handling a prepare message and the server crashes between them,
it ends up that skip_xid is cleared and the transaction will be
resent, which is identical to the worst-case above.
How are you thinking to update the skip xid before prepare? If we do
it in the same transaction then the changes in the catalog will be
part of the prepared xact but won't be committed. Now, say if we do it
after prepare, then the situation won't be the same because after
restart the same xact won't appear again.
--
With Regards,
Amit Kapila.
On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Few other comments on the latest patch:
=================================
1.
A conflict will produce an error and will stop the replication; it must be
resolved manually by the user. Details about the conflict can be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
+ subscriber's server log.

Can we slightly change the modified line to: "Details about the
conflict can be found in <xref
linkend="monitoring-pg-stat-subscription-workers"/> and the
subscriber's server log."?

Will fix it.

I think we can commit this change
separately as this is true even without this patch.

Right. It seems an oversight of 8d74fc96db. I've attached the patch.
Pushed.
--
With Regards,
Amit Kapila.
On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 11, 2022 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I was thinking what if we don't advance origin explicitly in this
case? Actually, that will be no different than the transactions where
the apply worker doesn't apply any change because the initial sync is
in progress (see should_apply_changes_for_rel()) or we have received
an empty transaction. In those cases also, the origin lsn won't be
advanced even though we acknowledge the advanced last_received
location because of keep_alive messages. Now, it is possible after the
restart we send the old start_lsn location because the replication
origin was not updated before restart but we handle that case in the
server by starting from the last confirmed location. See below code:

CreateDecodingContext()
{
..
else if (start_lsn < slot->data.confirmed_flush)
..

Good point. Probably one minor thing that is different from the
transaction where the apply worker applied an empty transaction is a
case where the server restarts/crashes before sending an
acknowledgment of the flush location. That is, in the case of the
empty transaction, the publisher sends an empty transaction again. On
the other hand in the case of skipping the transaction, a non-empty
transaction will be sent again but skip_xid is already changed or
cleared, therefore the user will have to specify skip_xid again. If we
write replication origin WAL record to advance the origin lsn, it
reduces the possibility of that. But I think it’s a very minor case so
we won’t need to deal with that.

Yeah, in the worst case, it will lead to conflict again and the user
needs to set the xid again.

On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. That is, currently we don't clear skip_xid while
handling a prepare message but do that while handling commit/rollback
prepared message, in order to avoid the worst case. If we do both
while handling a prepare message and the server crashes between them,
it ends up that skip_xid is cleared and the transaction will be
resent, which is identical to the worst-case above.

How are you thinking to update the skip xid before prepare? If we do
it in the same transaction then the changes in the catalog will be
part of the prepared xact but won't be committed. Now, say if we do it
after prepare, then the situation won't be the same because after
restart the same xact won't appear again.
I was thinking to commit the catalog change first in a separate
transaction while not updating origin LSN and then prepare an empty
transaction while updating origin LSN. If the server crashes between
them, the skip_xid is cleared but the transaction will be resent.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jan 11, 2022 at 7:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Few other comments on the latest patch:
=================================
1.
A conflict will produce an error and will stop the replication; it
must be resolved manually by the user. Details about the conflict can
be found in
- the subscriber's server log.
+ <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
+ subscriber's server log.

Can we slightly change the modified line to: "Details about the
conflict can be found in <xref
linkend="monitoring-pg-stat-subscription-workers"/> and the
subscriber's server log."?

Will fix it.

I think we can commit this change
separately as this is true even without this patch.

Right. It seems an oversight of 8d74fc96db. I've attached the patch.
Pushed.
Thanks!
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. That is, currently we don't clear skip_xid while
handling a prepare message but do that while handling commit/rollback
prepared message, in order to avoid the worst case. If we do both
while handling a prepare message and the server crashes between them,
it ends up that skip_xid is cleared and the transaction will be
resent, which is identical to the worst-case above.

How are you thinking to update the skip xid before prepare? If we do
it in the same transaction then the changes in the catalog will be
part of the prepared xact but won't be committed. Now, say if we do it
after prepare, then the situation won't be the same because after
restart the same xact won't appear again.

I was thinking to commit the catalog change first in a separate
transaction while not updating origin LSN and then prepare an empty
transaction while updating origin LSN.
But, won't it complicate the handling if in the future we try to
enhance this API such that it skips partial changes like skipping only
for particular relation(s) or particular operations as discussed
previously in this thread?
--
With Regards,
Amit Kapila.
On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. That is, currently we don't clear skip_xid while
handling a prepare message but do that while handling commit/rollback
prepared message, in order to avoid the worst case. If we do both
while handling a prepare message and the server crashes between them,
it ends up that skip_xid is cleared and the transaction will be
resent, which is identical to the worst-case above.How are you thinking to update the skip xid before prepare? If we do
it in the same transaction then the changes in the catalog will be
part of the prepared xact but won't be committed. Now, say if we do it
after prepare, then the situation won't be the same because after
restart the same xact won't appear again.I was thinking to commit the catalog change first in a separate
transaction while not updating origin LSN and then prepare an empty
transaction while updating origin LSN.

But, won't it complicate the handling if in the future we try to
enhance this API such that it skips partial changes like skipping only
for particular relation(s) or particular operations as discussed
previously in this thread?
Right. I was thinking that if we accept the situation where the user
has to set skip_xid again in case the server crashes, we might also be
able to accept the situation where the user has to clear skip_xid in
case the server crashes. But it seems the former is less problematic.
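As an aside, with the patch applied, whether a skip XID is still pending after such a crash can be checked directly from the new catalog column (test_sub is a placeholder subscription name):

```sql
-- subskipxid is 0 (InvalidTransactionId) unless a skip is pending.
SELECT subname, subskipxid
FROM pg_subscription
WHERE subname = 'test_sub';
```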
I've attached an updated patch that incorporated all comments I got so far.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v3-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patchapplication/octet-stream; name=v3-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patchDownload
From 001095812d1273b2f18fda60d7118196d5daa98b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v3] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
If an incoming change violates a constraint, logical replication stops
until the problem is resolved. This commit introduces another way to
skip the transaction in question, besides manually updating the
subscriber's database or using pg_replication_origin_advance().

The user can specify the XID with ALTER SUBSCRIPTION ... SKIP (xid = XXX),
which updates the pg_subscription.subskipxid field and tells the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.

After skipping the transaction, the apply worker clears
pg_subscription.subskipxid.
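For illustration, the user-facing flow this commit enables might look like the following sketch (the subscription name and XID are placeholders, matching the documentation example in this patch):

```sql
-- Identify the failing transaction's XID from the stats view.
SELECT last_error_xid
FROM pg_stat_subscription_workers
WHERE subname = 'test_sub';

-- Tell the apply worker to skip all changes of that transaction.
ALTER SUBSCRIPTION test_sub SKIP (xid = 716);

-- Setting NONE resets the transaction ID if it was set by mistake.
ALTER SUBSCRIPTION test_sub SKIP (xid = NONE);
```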
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 49 +++-
doc/src/sgml/ref/alter_subscription.sgml | 41 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/commands/subscriptioncmds.c | 53 +++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 256 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/tab-complete.c | 8 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 25 ++
src/test/regress/sql/subscription.sql | 13 ++
src/test/subscription/t/028_skip_xact.pl | 204 ++++++++++++++++
14 files changed, 671 insertions(+), 9 deletions(-)
create mode 100644 src/test/subscription/t/028_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 03e2537b07..c3c8b0b428 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7723,6 +7723,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskipxid</structfield> <type>xid</type>
+ </para>
+ <para>
+ ID of the transaction whose changes are to be skipped, if a valid
+ transaction ID; otherwise 0.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 96b4886e08..74add8ac36 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -353,15 +353,58 @@
<para>
The resolution can be done either by changing data or permissions on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the
+ transaction that conflicts with the existing data. When a conflict
+ produces an error, it is shown in the
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_xid | 716
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-09-29 15:52:45.165754+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ In this case, you need to consider changing the data on the subscriber so that it
+ doesn't conflict with incoming changes, or dropping the conflicting constraint or
+ unique index, or writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or, as a last resort, skipping the whole transaction.
+ Skipping the whole transaction includes skipping changes that may not violate
+ any constraint. This can easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..a5c0ddd6fc 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -207,6 +208,46 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ Skip applying the changes of a particular transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming change or by skipping
+ the whole transaction. The logical replication worker skips all data
+ modification changes within the specified transaction including the changes
+ that may not violate the constraint, so it should only be used as a last
+ resort. This option has no effect on a transaction that is already
+ prepared by enabling <literal>two_phase</literal> on the subscriber. After
+ the logical replication worker successfully skips the transaction, the transaction
+ ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the transaction whose changes are to be skipped
+ by the logical replication worker. Setting <literal>NONE</literal> resets
+ the transaction ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..da199e9a3e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skipxid = subform->subskipxid;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f5eba450ce..dd35faa4c9 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -61,6 +61,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +83,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId xid; /* InvalidTransactionId for resetting purpose, otherwise
+ * normal transaction id */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +252,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_XID) &&
+ strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (IsSet(opts->specified_opts, SUBOPT_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
+ else
+ {
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id: %s", xid_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_XID;
+ opts->xid = xid;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +494,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1115,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ if (IsSet(opts.specified_opts, SUBOPT_XID))
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ update_tuple = true;
+ }
+
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 879018377b..d4b542d0bf 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9954,6 +9954,15 @@ AlterSubscriptionStmt:
(Node *)makeInteger(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c9af775bc1..49e9c251f2 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -257,6 +258,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * Valid (and is_skipping_changes() is true) while we are skipping all data
+ * modification changes (INSERT, UPDATE, etc.) of the transaction specified
+ * by MySubscription->skipxid. Once we start skipping changes, we don't stop
+ * until we have skipped all changes of the transaction, even if
+ * pg_subscription is updated and MySubscription->skipxid gets changed or
+ * reset in the meantime. Also, in streaming cases we don't skip receiving
+ * the changes, since we decide whether or not to skip applying the changes
+ * when starting to apply them. At the end of the transaction, we stop
+ * skipping and reset the skip XID. The timing of resetting the skip XID
+ * varies between the commit and the commit/rollback prepared cases. Please
+ * refer to the comments in the corresponding functions for details.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -332,6 +348,13 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+static void clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
@@ -791,6 +814,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -843,6 +871,11 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -856,6 +889,36 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
{
char gid[GIDSIZE];
+ /*
+ * If we are skipping all changes of this transaction, we stop doing so
+ * here, but unlike commit, we do not clear subskipxid of the
+ * pg_subscription catalog and will do that at commit prepared or
+ * rollback prepared time.
+ * If we update the catalog and then prepare the transaction, the changes
+ * will be part of the prepared transaction. Even if we do that in reverse
+ * order, subskipxid will not be cleared but this transaction won’t be
+ * resent.
+ *
+ * subskipxid might be changed or cleared by the user before we receive
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay because this
+ * prepared transaction is empty.
+ *
+ * One might think that we can skip preparing the skipped transaction and
+ * also skip COMMIT PREPARED or ROLLBACK PREPARED by comparing the XID
+ * received as part of the message to the value of subskipxid. But
+ * subskipxid could be changed by users between PREPARE and COMMIT PREPARED
+ * or ROLLBACK PREPARED. There was an idea to disallow users from changing
+ * the value of subskipxid while skipping changes. But we don't know when
+ * COMMIT PREPARED or ROLLBACK PREPARED comes and another conflict could
+ * occur in the meantime. If such a conflict occurs, we cannot
+ * skip the transaction by using subskipxid. Also, there was another idea
+ * to check whether the transaction has been prepared or not by checking
+ * GID, origin LSN, and origin timestamp of the prepared transaction but
+ * that doesn't seem worthwhile because it requires protocol changes, and
+ * skipping transactions shouldn't be common.
+ */
+ if (is_skipping_changes())
+ stop_skipping_changes(false, InvalidXLogRecPtr, 0);
+
/*
* Compute unique GID for two_phase transactions. We don't use GID of
* prepared transaction sent by server as that can lead to deadlock when
@@ -901,9 +964,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It is
+ * done this way because at commit prepared time, we won't know whether we
+ * have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -940,6 +1003,26 @@ apply_handle_commit_prepared(StringInfo s)
logicalrep_read_commit_prepared(s, &prepare_data);
set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ /* Update the subscription cache if necessary */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid == prepare_data.xid)
+ {
+ /*
+ * Clear the subskipxid of the pg_subscription catalog. This catalog
+ * update must be committed before finishing the prepared transaction,
+ * because otherwise, if the server crashes between finishing the
+ * prepared transaction and the catalog update, COMMIT
+ * PREPARED won’t be resent but subskipxid is left set.
+ *
+ * Also, we must not update the replication origin LSN and timestamp
+ * while committing the catalog update so that COMMIT PREPARED will
+ * be resent in case of a crash immediately after the catalog update
+ * commit.
+ */
+ clear_subscription_skip_xid(prepare_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
gid, sizeof(gid));
@@ -981,6 +1064,19 @@ apply_handle_rollback_prepared(StringInfo s)
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ /* Update the subscription cache if necessary */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid == rollback_data.xid)
+ {
+ /*
+ * Same as COMMIT PREPARED, we must clear subskipxid of pg_subscription
+ * before rolling back the prepared transaction. Please see the comments
+ * in apply_handle_commit_prepared() for details.
+ */
+ clear_subscription_skip_xid(rollback_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
gid, sizeof(gid));
@@ -1209,6 +1305,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of a transaction that is
+ * rolled back, but if the skip XID is set, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
* just delete the files with serialized info.
@@ -1331,6 +1434,9 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1432,6 +1538,10 @@ apply_handle_stream_commit(StringInfo s)
apply_spooled_messages(xid, commit_data.commit_lsn);
+ /*
+ * Commit streamed transaction. If we're skipping this transaction,
+ * we stop it in apply_handle_commit_internal().
+ */
apply_handle_commit_internal(&commit_data);
/* unlink the files with serialized changes and subxact info */
@@ -1451,7 +1561,20 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
- if (IsTransactionState())
+ if (is_skipping_changes())
+ {
+ /*
+ * If we are skipping all changes of this transaction, we stop it
+ * and clear subskipxid of pg_subscription.
+ */
+ stop_skipping_changes(true, commit_data->end_lsn,
+ commit_data->committime);
+
+ pgstat_report_stat(false);
+
+ store_flush_position(commit_data->end_lsn);
+ }
+ else if (IsTransactionState())
{
/*
* Update origin state so we can restart streaming from correct
@@ -2366,6 +2489,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recursively when applying spooled changes, save the current
@@ -3661,6 +3795,120 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /* Make sure subscription cache is up-to-date */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid != xid)
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skipping_xid = xid;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes by resetting skipping_xid. If clear_subskipxid is true,
+ * we also clear subskipxid of pg_subscription by setting InvalidTransactionId.
+ */
+static void
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skipping_xid)));
+
+ if (clear_subskipxid)
+ clear_subscription_skip_xid(skipping_xid, origin_lsn, origin_timestamp);
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+}
+
+/* clear subskipxid of pg_subscription catalog */
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Get subskipxid value */
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskipxid of the tuple to InvalidTransactionId. If the user
+ * has already changed subskipxid before we clear it, we don't update the
+ * catalog and don't advance the replication origin state. So in the worst
+ * case, if the server crashes before sending an acknowledgment of the
+ * flush position, the transaction will be sent again and the user needs
+ * to set subskipxid again. We could reduce that possibility by logging a
+ * replication origin WAL record to advance the origin LSN instead, but
+ * that doesn't seem worthwhile since it's a very minor case.
+ */
+ if (subform->subskipxid == xid)
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_timestamp;
+
+ /* reset subskipxid */
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 92ab95724d..29aea5b56b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4301,6 +4301,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't fetch subskipxid as we don't
+ * include it in the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 39be6f556a..db95b50a91 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1691,6 +1691,12 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("xid");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c291289f..d4410da58f 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 413e7c85a1..ab3554f234 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3716,7 +3716,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..e747057ba0 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -93,6 +93,31 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 4294967295
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 0
+(1 row)
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ERROR: invalid transaction id: 1.1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction id: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ERROR: invalid transaction id: 1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+ERROR: invalid transaction id: 2
\dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..6d8392758c 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,19 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/028_skip_xact.pl b/src/test/subscription/t/028_skip_xact.pl
new file mode 100644
index 0000000000..a63c9c345e
--- /dev/null
+++ b/src/test/subscription/t/028_skip_xact.pl
@@ -0,0 +1,204 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 6;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts on the subscriber. After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# by ALTER SUBSCRIPTION ... SKIP. Then, check that logical replication can continue
+# working by inserting $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname, $nonconflict_data,
+ $expected, $xid, $msg) = @_;
+
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND last_error_xid = '$xid'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]) or die "Timed out while waiting for worker error";
+
+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (xid = '$xid')");
+
+ # Restart the subscriber node to restart logical replication with no interval.
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped.
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT subskipxid = 0 FROM pg_subscription WHERE subname = '$subname'");
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (two_phase = on, streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data to test_tab, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab. Then skip the transaction.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", $xid,
+ "test skipping transaction");
+
+# Test for PREPARE. Insert the same data to test_tab and PREPARE the transaction,
+# raising an error. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", $xid,
+ "test skipping prepare and commit prepared");
+
+# Test for PREPARE and ROLLBACK PREPARED.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(4)", "4", $xid,
+ "test skipping prepare and rollback prepared");
+
+# Test for STREAM COMMIT. Insert enough rows to test_tab_streaming to exceed the 64kB
+# limit, also raising an error on the subscriber during applying spooled changes for the
+# same reason. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(2, md5(2::text))", "2", $xid,
+ "test skipping stream-commit");
+
+# Test for STREAM PREPARE and COMMIT PREPARED.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(3, md5(3::text))", "3", $xid,
+ "test skipping stream-prepare and commit prepared");
+
+my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0", "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
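For reference, the end-to-end workflow this patch enables can be sketched as a SQL session on the subscriber (hypothetical; the subscription name and xid value are illustrative, and the syntax assumes the patch is applied):

```sql
-- The apply worker keeps failing; find the problem transaction's
-- remote xid from the subscription worker statistics view.
SELECT subname, last_error_xid, last_error_message
FROM pg_stat_subscription_workers
WHERE subname = 'tap_sub';

-- Tell the apply worker to skip all changes of that transaction.
-- 716 stands in for the xid reported above; it must be a valid xid
-- (values 0, 1, and 2 are rejected, per the regression tests).
ALTER SUBSCRIPTION tap_sub SKIP (xid = 716);

-- After the transaction has been skipped, the worker resets the
-- catalog field, so subskipxid returns to 0.
SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'tap_sub';

-- Setting xid = NONE clears a pending skip xid manually.
ALTER SUBSCRIPTION tap_sub SKIP (xid = NONE);
```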
On Mon, Jan 10, 2022 at 6:27 PM vignesh C <vignesh21@gmail.com> wrote:
On Fri, Jan 7, 2022 at 11:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jan 7, 2022 at 10:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I thought we just want to lock before clearing the skip_xid something
like take the lock, check if the skip_xid in the catalog is the same
as we have skipped, if it is the same then clear it, otherwise, leave
it as it is. How will that disallow users to change skip_xid when we
are skipping changes?

Oh I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).

So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.

I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.

I've tested it and realized that we cannot use replorigin_advance()
for this purpose without changes. That is, the current
replorigin_advance() doesn't allow the origin to be advanced by its owner:

/* Make sure it's not used by somebody else */
if (replication_state->acquired_by != 0)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
errmsg("replication origin with OID %d is already
active for PID %d",
replication_state->roident,
replication_state->acquired_by)));
}

So we need to change it so that the origin owner can advance its
origin, which makes sense to me.

Also, when we have to update the origin instead of committing the
catalog change while updating the origin, we cannot record the origin
timestamp.

Is it because we currently update the origin timestamp with commit record?
Yes.
This behavior makes sense to me because we skipped the
transaction. But ISTM it’s not good if we emit the origin timestamp
only when directly updating the origin. So probably we need to always
omit origin timestamp.

Do you mean to say that you want to omit it even when we are
committing the changes?

Yes, it would be better to record only origin lsn in terms of consistency.
Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. Probably it comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid on
shmem (e.g., ReplicationState)?

IIRC, the problem with that idea was that we won't remember skip_xid
information after server restart and the user won't even know that it
has to set it again.

Right, I agree that it’s not convenient when the server restarts or
crashes, but these problems may not be critical in the situation
where users have to use this feature; the subscriber already entered
an error loop so they can know the xid again, and it’s an uncommon case
that they need to restart during skipping changes.

Anyway, I'll submit an updated patch soon so we can discuss complexity
vs. convenience.

Attached an updated patch. Please review it.
Thank you for the comments!
Thanks for the updated patch, few comments:

1) Should this be case insensitive to support NONE too:
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
I think the string value is always lowercase so we don't need to use
strcasecmp here.
2) Can we have an option to specify last_error_xid of
pg_stat_subscription_workers. Something like:
alter subscription sub1 skip ( XID = 'last_subscription_error');

When the user specified last_subscription_error, it should pick
last_error_xid from pg_stat_subscription_workers.
As this operation is a critical operation, if there is an option which
could automatically pick and set from pg_stat_subscription_workers, it
would be useful.
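Until such an option exists, a similar pick-and-set could be scripted from psql with variable substitution (a rough sketch, not part of the patch; it assumes the patched pg_stat_subscription_workers view and SKIP command, and the subscription name sub1 is illustrative):

```sql
-- Grab the last failed remote xid for the subscription into a psql
-- variable via \gset...
SELECT last_error_xid AS skip_xid
FROM pg_stat_subscription_workers
WHERE subname = 'sub1' AND subrelid IS NULL \gset

-- ...and splice it into the SKIP command; psql substitutes :'skip_xid'
-- as a quoted literal, which the patch accepts (cf. the TAP test's
-- SKIP (xid = '$xid') usage).
ALTER SUBSCRIPTION sub1 SKIP (xid = :'skip_xid');
```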
As I mentioned before in another mail, I think we can do that in a
separate patch.
3) Currently the following syntax is being supported, I felt this
should throw an error:
postgres=# alter subscription sub1 set ( XID = 100);
ALTER SUBSCRIPTION
Fixed.
4) You might need to rebase the patch:
git am v2-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch
Applying: Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
error: patch failed: doc/src/sgml/logical-replication.sgml:333
error: doc/src/sgml/logical-replication.sgml: patch does not apply
Patch failed at 0001 Add ALTER SUBSCRIPTION ... SKIP to skip the
transaction on subscriber nodes
hint: Use 'git am --show-current-patch=diff' to see the failed patch

5) You might have to rename 027_skip_xact to 028_skip_xact as 027_nosuperuser.pl already exists:
diff --git a/src/test/subscription/t/027_skip_xact.pl b/src/test/subscription/t/027_skip_xact.pl
new file mode 100644
index 0000000000..a63c9c345e
--- /dev/null
+++ b/src/test/subscription/t/027_skip_xact.pl
I've resolved these conflicts.
These comments are incorporated into the latest v3 patch I just submitted[1].
Regards,
[1]: /messages/by-id/CAD21AoD9JXah2V8uFURUpZbK_ewsut+jb1ESm6YQkrhQm3nJRg@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Jan 12, 2022 at 11:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. That is, currently we don't clear skip_xid while
handling a prepare message but do that while handling commit/rollback
prepared message, in order to avoid the worst case. If we do both
while handling a prepare message and the server crashes between them,
it ends up that skip_xid is cleared and the transaction will be
resent, which is identical to the worst-case above.

How are you thinking to update the skip xid before prepare? If we do
it in the same transaction then the changes in the catalog will be
part of the prepared xact but won't be committed. Now, say if we do it
after prepare, then the situation won't be the same because after
restart the same xact won't appear again.

I was thinking to commit the catalog change first in a separate
transaction while not updating origin LSN and then prepare an empty
transaction while updating origin LSN.

But, won't it complicate the handling if in the future we try to
enhance this API such that it skips partial changes like skipping only
for particular relation(s) or particular operations as discussed
previously in this thread?

Right. I was thinking that if we accept the situation that the user
has to set skip_xid again in case of the server crashes, we might be
able to accept also the situation that the user has to clear skip_xid
in a case of the server crashes. But it seems the former is less
problematic.

I've attached an updated patch that incorporated all comments I got so far.
Thanks for the updated patch, few comments:
1) Currently skip xid is not displayed in describe subscriptions, can
we include it too:
\dRs+ sub1
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two
phase commit | Synchronous commit | Conninfo
------+---------+---------+-------------+--------+-----------+------------------+--------------------+--------------------------------
sub1 | vignesh | t | {pub1} | f | f | e
| off | dbname=postgres host=localhost
(1 row)
2) This import "use PostgreSQL::Test::Utils;" is not required:
+# Tests for skipping logical replication transactions.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 6;
3) Some of the comments use a punctuation mark and some of them do
not. Should we keep it consistent:
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
4) Should this be changed:
+ * True if we are skipping all data modification changes (INSERT,
UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we
start skipping
+ * changes, we don't stop it until the we skip all changes of the
transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid
gets changed or
to:
+ * True if we are skipping all data modification changes (INSERT,
UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we
start skipping
+ * changes, we don't stop it until we skip all changes of the transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid
gets changed or
In "stop it until the we skip all changes", here "the" is not required.
Regards,
Vignesh
On Wed, Jan 12, 2022 2:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch that incorporated all comments I got so far.
Thanks for updating the patch. Here are some comments:
1)
+ Skip applying changes of the particular transaction. If incoming data
Should "Skip" be "Skips" ?
2)
+ prepared by enabling <literal>two_phase</literal> on susbscriber. After h
+ the logical replication successfully skips the transaction, the transaction
The "h" after word "After" seems redundant.
3)
+ Skipping the whole transaction includes skipping the cahnge that may not violate
"cahnge" should be "changes" I think.
4)
+/*
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we start skipping
...
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
Maybe we should modify this comment. Something like:
skipping_xid is valid if we are skipping all data modification changes ...
5)
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));
Should we change the message to "must be superuser to skip xid"?
Because the SQL stmt is "ALTER SUBSCRIPTION ... SKIP (xid = XXX)".
Regards,
Tang
On Wed, Jan 12, 2022 at 11:10 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, Jan 12, 2022 at 11:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. That is, currently we don't clear skip_xid while
handling a prepare message but do that while handling commit/rollback
prepared message, in order to avoid the worst case. If we do both
while handling a prepare message and the server crashes between them,
it ends up that skip_xid is cleared and the transaction will be
resent, which is identical to the worst-case above.

How are you thinking to update the skip xid before prepare? If we do
it in the same transaction then the changes in the catalog will be
part of the prepared xact but won't be committed. Now, say if we do it
after prepare, then the situation won't be the same because after
restart the same xact won't appear again.

I was thinking to commit the catalog change first in a separate
transaction while not updating origin LSN and then prepare an empty
transaction while updating origin LSN.

But, won't it complicate the handling if in the future we try to
enhance this API such that it skips partial changes like skipping only
for particular relation(s) or particular operations as discussed
previously in this thread?

Right. I was thinking that if we accept the situation that the user
has to set skip_xid again in case of the server crashes, we might be
able to accept also the situation that the user has to clear skip_xid
in a case of the server crashes. But it seems the former is less
problematic.

I've attached an updated patch that incorporated all comments I got so far.
Thanks for the updated patch, few comments:
Thank you for the comments!
1) Currently skip xid is not displayed in describe subscriptions, can
we include it too:
\dRs+ sub1
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two
phase commit | Synchronous commit | Conninfo
------+---------+---------+-------------+--------+-----------+------------------+--------------------+--------------------------------
sub1 | vignesh | t | {pub1} | f | f | e
| off | dbname=postgres host=localhost
(1 row)

2) This import "use PostgreSQL::Test::Utils;" is not required:
+# Tests for skipping logical replication transactions.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 6;

3) Some of the comments use a punctuation mark and some of them do not. Should we keep it consistent:
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');

4) Should this be changed:
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we start skipping
+ * changes, we don't stop it until the we skip all changes of the transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid gets changed or
to:
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we start skipping
+ * changes, we don't stop it until we skip all changes of the transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid gets changed or

In "stop it until the we skip all changes", here "the" is not required.
I agree with all the comments above. I've attached an updated patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v4-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patchapplication/octet-stream; name=v4-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patchDownload
From 3fddf96b638e1348a50051620f39235cc423f925 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v4] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
If an incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX),
updating pg_subscription.subskipxid field, telling the apply worker to
skip the transaction. The apply worker skips all data modification
changes within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskipxid.
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 49 +++-
doc/src/sgml/ref/alter_subscription.sgml | 41 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/commands/subscriptioncmds.c | 53 +++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 253 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 8 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 121 ++++++----
src/test/regress/sql/subscription.sql | 13 ++
src/test/subscription/t/028_skip_xact.pl | 217 ++++++++++++++++++
15 files changed, 734 insertions(+), 60 deletions(-)
create mode 100644 src/test/subscription/t/028_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 03e2537b07..c3c8b0b428 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7723,6 +7723,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskipxid</structfield> <type>xid</type>
+ </para>
+ <para>
+ ID of the transaction whose changes are to be skipped, if a valid
+ transaction ID; otherwise 0.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 96b4886e08..e5f7f2bd4f 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -353,15 +353,58 @@
<para>
The resolution can be done either by changing data or permissions on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the
+ transaction that conflicts with the existing data. When a conflict
+ produces an error, it is shown in
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_xid | 716
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-09-29 15:52:45.165754+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ Before skipping the whole transaction, consider changing the data on the
+ subscriber so that it doesn't conflict with incoming changes, dropping the
+ conflicting constraint or unique index, or writing a trigger on the
+ subscriber to suppress or redirect conflicting incoming changes. Skipping
+ the whole transaction is a last resort, since it also skips changes that
+ may not violate any constraint. This can easily make the subscriber
+ inconsistent, especially if a user specifies the wrong transaction ID or
+ the wrong position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
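For reviewers, a minimal end-to-end sketch of the workflow documented above (the subscription name, relation, and XID 716 are taken from the sample output earlier; the `pg_stat_subscription_workers` columns and the `SKIP` syntax assume this patch is applied):

```sql
-- On the subscriber: find the failing transaction.
SELECT subname, last_error_xid, last_error_message
FROM pg_stat_subscription_workers;

-- Skip exactly that transaction (superuser only; last resort).
ALTER SUBSCRIPTION test_sub SKIP (xid = 716);

-- After the worker has skipped it, subskipxid is reset to 0.
SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'test_sub';

-- To cancel a pending skip before the transaction arrives:
ALTER SUBSCRIPTION test_sub SKIP (xid = NONE);
```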
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..f9e2512a6b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -207,6 +208,46 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the specified transaction. If incoming data
+ violates any constraint, logical replication will stop until the problem
+ is resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with the incoming change or by
+ skipping the whole transaction. The logical replication worker skips all
+ data modification changes within the specified transaction, including
+ changes that may not violate the constraint, so it should only be used as
+ a last resort. This option has no effect on a transaction that is already
+ prepared by enabling <literal>two_phase</literal> on the subscriber. After
+ the logical replication worker successfully skips the transaction, the
+ transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the transaction whose changes are to be skipped
+ by the logical replication worker. Setting <literal>NONE</literal> resets
+ the transaction ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..da199e9a3e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skipxid = subform->subskipxid;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f5eba450ce..1082b60943 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -61,6 +61,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +83,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId xid; /* InvalidTransactionId resets the skip XID;
+ * otherwise a normal transaction ID */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +252,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_XID) &&
+ strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (IsSet(opts->specified_opts, SUBOPT_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
+ else
+ {
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id: %s", xid_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_XID;
+ opts->xid = xid;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +494,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1115,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ if (IsSet(opts.specified_opts, SUBOPT_XID))
+ {
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ update_tuple = true;
+ }
+
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 879018377b..d4b542d0bf 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9954,6 +9954,15 @@ AlterSubscriptionStmt:
(Node *)makeInteger(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c9af775bc1..335d622292 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -257,6 +258,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * skipping_xid is valid while we are skipping all data modification changes
+ * (INSERT, UPDATE, etc.) of the transaction specified by
+ * MySubscription->skipxid. Once we start skipping changes, we don't stop
+ * until we have skipped all changes of the transaction, even if
+ * pg_subscription is updated and MySubscription->skipxid is changed or
+ * reset in the meantime. Also, in streaming cases we don't skip receiving
+ * the changes, since we decide whether or not to skip applying them when
+ * starting to apply the changes. At the end of the transaction, we disable
+ * skipping and reset the skip XID. The timing of resetting the skip XID
+ * differs between the commit case and the commit/rollback prepared cases.
+ * Please refer to the comments in the corresponding functions for details.
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -332,6 +348,13 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+static void clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
@@ -791,6 +814,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -843,6 +871,11 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -856,6 +889,36 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
{
char gid[GIDSIZE];
+ /*
+ * If we are skipping all changes of this transaction, we stop doing so
+ * here but, unlike commit, we do not clear subskipxid of the
+ * pg_subscription catalog here; we will do that at COMMIT PREPARED or
+ * ROLLBACK PREPARED time. If we updated the catalog and then prepared
+ * the transaction, the catalog change would become part of the prepared
+ * transaction. Even if we did that in reverse order, subskipxid would
+ * not be cleared, yet this transaction wouldn't be resent in a case
+ * where the server crashes between the two steps.
+ *
+ * subskipxid might be changed or cleared by the user before we receive
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that's okay because this
+ * prepared transaction is empty.
+ *
+ * One might think that we could skip preparing the skipped transaction
+ * and also skip COMMIT PREPARED or ROLLBACK PREPARED by comparing the
+ * XID received as part of the message to subskipxid. But subskipxid
+ * could be changed by users between PREPARE and COMMIT PREPARED or
+ * ROLLBACK PREPARED. There was an idea to disallow users to change
+ * subskipxid while skipping changes. But we don't know when COMMIT
+ * PREPARED or ROLLBACK PREPARED will come, and another conflict could
+ * occur in the meantime. If such a conflict occurred, we could not skip
+ * that transaction by using subskipxid. There was also an idea to check
+ * whether the transaction has been prepared by comparing the GID,
+ * origin LSN, and origin timestamp of the prepared transaction, but
+ * that doesn't seem worthwhile because it requires protocol changes,
+ * and skipping transactions shouldn't be common.
+ */
+ if (is_skipping_changes())
+ stop_skipping_changes(false, InvalidXLogRecPtr, 0);
+
/*
* Compute unique GID for two_phase transactions. We don't use GID of
* prepared transaction sent by server as that can lead to deadlock when
@@ -901,9 +964,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -940,6 +1003,26 @@ apply_handle_commit_prepared(StringInfo s)
logicalrep_read_commit_prepared(s, &prepare_data);
set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ /* Update the subscription cache if necessary */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid == prepare_data.xid)
+ {
+ /*
+ * Clear the subskipxid of the pg_subscription catalog. This catalog
+ * update must be committed before finishing the prepared transaction,
+ * because otherwise, if the server crashes between finishing the
+ * prepared transaction and the catalog update, COMMIT PREPARED won't
+ * be resent but subskipxid is left set.
+ *
+ * Also, we must not update the replication origin LSN and timestamp
+ * while committing the catalog update so that COMMIT PREPARED will be
+ * resent in case of a crash immediately after the catalog update
+ * commit.
+ */
+ clear_subscription_skip_xid(prepare_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
gid, sizeof(gid));
@@ -981,6 +1064,20 @@ apply_handle_rollback_prepared(StringInfo s)
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ /* Update the subscription cache if necessary */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid == rollback_data.xid)
+ {
+ /*
+ * Same as COMMIT PREPARED, we must clear subskipxid of
+ * pg_subscription before rolling back the prepared transaction.
+ * Please see the comments in apply_handle_commit_prepared() for
+ * details.
+ */
+ clear_subscription_skip_xid(rollback_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
gid, sizeof(gid));
@@ -1209,6 +1306,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of a transaction that is
+ * rolled back, but if the skip XID is set, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
* just delete the files with serialized info.
@@ -1331,6 +1435,9 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1451,7 +1558,20 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
- if (IsTransactionState())
+ if (is_skipping_changes())
+ {
+ /*
+ * If we are skipping all changes of this transaction, we stop it and
+ * clear subskipxid of pg_subscription.
+ */
+ stop_skipping_changes(true, commit_data->end_lsn,
+ commit_data->committime);
+
+ pgstat_report_stat(false);
+
+ store_flush_position(commit_data->end_lsn);
+ }
+ else if (IsTransactionState())
{
/*
* Update origin state so we can restart streaming from correct
@@ -2366,6 +2486,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3661,6 +3792,120 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /* Make sure subscription cache is up-to-date */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid != xid)
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skipping_xid = xid;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes by resetting skipping_xid. If clear_subskipxid is true,
+ * we also clear subskipxid of pg_subscription by setting it to InvalidTransactionId.
+ */
+static void
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skipping_xid)));
+
+ if (clear_subskipxid)
+ clear_subscription_skip_xid(skipping_xid, origin_lsn, origin_timestamp);
+
+ /* Stop skipping changes */
+ skipping_xid = InvalidTransactionId;
+}
+
+/* Clear subskipxid of the pg_subscription catalog */
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Get subskipxid value */
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskipxid of the tuple to InvalidTransactionId. If the
+ * user has already changed subskipxid before we clear it, we don't
+ * update the catalog and don't advance the replication origin state.
+ * So in the worst case, if the server crashes before sending an
+ * acknowledgment of the flush position, the transaction will be sent
+ * again and the user needs to set subskipxid again. We could reduce
+ * the possibility by logging a replication origin WAL record to
+ * advance the origin LSN instead, but that doesn't seem worth it
+ * since it's a very minor case.
+ */
+ if (subform->subskipxid == xid)
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_timestamp;
+
+ /* reset subskipxid */
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 92ab95724d..29aea5b56b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4301,6 +4301,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't fetch subskipxid as we don't
+ * include it in the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8587b19160..2895ddcea3 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6029,7 +6029,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false};
+ false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6066,8 +6066,10 @@ describeSubscriptions(const char *pattern, bool verbose)
/* Two_phase is only supported in v15 and higher */
if (pset.sversion >= 150000)
appendPQExpBuffer(&buf,
- ", subtwophasestate AS \"%s\"\n",
- gettext_noop("Two phase commit"));
+ ", subtwophasestate AS \"%s\"\n"
+ ", subskipxid AS \"%s\"\n",
+ gettext_noop("Two phase commit"),
+ gettext_noop("Skip XID"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 39be6f556a..db95b50a91 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1691,6 +1691,12 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("xid");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c291289f..d4410da58f 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 413e7c85a1..ab3554f234 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3716,7 +3716,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..4279a6c3ea 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,36 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 4294967295
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 0
+(1 row)
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ERROR: invalid transaction id: 1.1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction id: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ERROR: invalid transaction id: 1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+ERROR: invalid transaction id: 2
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+----------+--------------------+------------------------------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | off | dbname=regress_doesnotexist2
(1 row)
BEGIN;
@@ -129,10 +154,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+----------+--------------------+------------------------------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | local | dbname=regress_doesnotexist2
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +190,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +213,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication already exists
@@ -215,10 +240,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication used more then once
@@ -233,10 +258,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +295,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | 0 | off | dbname=regress_doesnotexist
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +307,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +319,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..6d8392758c 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,19 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/028_skip_xact.pl b/src/test/subscription/t/028_skip_xact.pl
new file mode 100644
index 0000000000..4c107fc8f5
--- /dev/null
+++ b/src/test/subscription/t/028_skip_xact.pl
@@ -0,0 +1,217 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use Test::More tests => 7;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts on the subscriber. After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication can continue
+# working by inserting $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname, $nonconflict_data,
+ $expected, $xid, $msg) = @_;
+
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND last_error_xid = '$xid'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]) or die "Timed out while waiting for worker error";
+
+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (xid = '$xid')");
+
+ # Restart the subscriber node to restart logical replication with no interval
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT subskipxid = 0 FROM pg_subscription WHERE subname = '$subname'");
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (two_phase = on, streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data to test_tab, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab. Then skip the transaction.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", $xid,
+ "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data to test_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", $xid,
+ "test skipping prepare and commit prepared ");
+
+# Test for PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(4)", "4", $xid,
+ "test skipping prepare and rollback prepared");
+
+# Test for STREAM COMMIT. Insert enough rows to test_tab_streaming to exceed the 64kB
+# limit, also raising an error on the subscriber during applying spooled changes for the
+# same reason. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(2, md5(2::text))", "2", $xid,
+ "test skipping stream-commit");
+
+# Test for STREAM PREPARE and COMMIT PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(3, md5(3::text))", "3", $xid,
+ "test skipping stream-prepare and commit prepared");
+
+# Test for STREAM PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(4, md5(4::text))", "4", $xid,
+ "test skipping stream-prepare and rollback prepared");
+
+my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0", "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
On Thu, Jan 13, 2022 at 10:07 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Wed, Jan 12, 2022 2:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch that incorporated all comments I got so far.
Thanks for updating the patch. Here are some comments:
Thank you for the comments!
1)
+ Skip applying changes of the particular transaction. If incoming data

Should "Skip" be "Skips"?
2)
+ prepared by enabling <literal>two_phase</literal> on susbscriber. After h
+ the logical replication successfully skips the transaction, the transaction

The "h" after the word "After" seems redundant.
3)
+ Skipping the whole transaction includes skipping the cahnge that may not violate

"cahnge" should be "changes" I think.
4)
+/*
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we start skipping
...
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))

Maybe we should modify this comment. Something like:
"skipping_xid is valid if we are skipping all data modification changes ..."

5)
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));

Should we change the message to "must be superuser to skip xid"?
Because the SQL stmt is "ALTER SUBSCRIPTION ... SKIP (xid = XXX)".
I agree with all the comments above. These are incorporated into the
latest v4 patch I've just submitted[1].
Regards,
[1]: postgresql.org/message-id/CAD21AoBZC87nY1pCaexk1uBA68JSBmy2-UqLGirT9g-RVMhjKw%40mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Jan 14, 2022 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 12, 2022 at 11:10 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, Jan 12, 2022 at 11:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. That is, currently we don't clear skip_xid while
handling a prepare message but do that while handling commit/rollback
prepared message, in order to avoid the worst case. If we do both
while handling a prepare message and the server crashes between them,
it ends up that skip_xid is cleared and the transaction will be
resent, which is identical to the worst-case above.

How are you thinking to update the skip xid before prepare? If we do
it in the same transaction then the changes in the catalog will be
part of the prepared xact but won't be committed. Now, say if we do it
after prepare, then the situation won't be the same because after
restart the same xact won't appear again.

I was thinking to commit the catalog change first in a separate
transaction while not updating origin LSN and then prepare an empty
transaction while updating origin LSN.

But, won't it complicate the handling if in the future we try to
enhance this API such that it skips partial changes like skipping only
for particular relation(s) or particular operations as discussed
previously in this thread?

Right. I was thinking that if we accept the situation that the user
has to set skip_xid again in case of the server crashes, we might be
able to accept also the situation that the user has to clear skip_xid
in a case of the server crashes. But it seems the former is less
problematic.

I've attached an updated patch that incorporated all comments I got so far.
Thanks for the updated patch, few comments:
Thank you for the comments!
1) Currently skip xid is not displayed in describe subscriptions, can
we include it too:
\dRs+ sub1
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------+---------+---------+-------------+--------+-----------+------------------+--------------------+--------------------------------
sub1 | vignesh | t | {pub1} | f | f | e | off | dbname=postgres host=localhost
(1 row)

2) This import "use PostgreSQL::Test::Utils;" is not required:
+# Tests for skipping logical replication transactions.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 6;

3) Some of the comments use a punctuation mark and some of them do not. Should we keep it consistent?
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',

+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',

+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');

+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');

4) Should this be changed:
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we start skipping
+ * changes, we don't stop it until the we skip all changes of the transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid gets changed or

to:
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we start skipping
+ * changes, we don't stop it until we skip all changes of the transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid gets changed or

In "stop it until the we skip all changes", "the" is not required.
I agree with all the comments above. I've attached an updated patch.
Thanks for the updated patch, few minor comments:
1) Should "SKIP" be "SKIP (" here:
@@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
2) We could add a test for this if possible:
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
3) There was one typo in the commit message, "transaciton" should be "transaction":
After skipping the transaciton the apply worker clears
pg_subscription.subskipxid.
Another small typo, susbscriber should be subscriber:
+ prepared by enabling <literal>two_phase</literal> on susbscriber. After
+ the logical replication successfully skips the transaction, the transaction
4) Should skipsubxid be mentioned as subskipxid here:
+ * Clear the subskipxid of pg_subscription catalog. This catalog
+ * update must be committed before finishing prepared transaction.
+ * Because otherwise, in a case where the server crashes between
+ * finishing prepared transaction and the catalog update, COMMIT
+ * PREPARED won’t be resent but skipsubxid is left.
Regards,
Vignesh
On Fri, Jan 14, 2022 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I agree with all the comments above. I've attached an updated patch.
Review comments
================
1.
+
+ <para>
+ In this case, you need to consider changing the data on the subscriber so that it
The starting of this sentence doesn't make sense to me. How about
changing it like: "To resolve conflicts, you need to ...
2.
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for
this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the transaction whose changes are to be skipped
+ by the logical replication worker. Setting
<literal>NONE</literal> resets
+ the transaction ID.
+ </para>
Empty spaces after line finish are inconsistent. I personally use a
single space before a new line but I see that others use two spaces
and the nearby documentation also uses two spaces in this regard so I
am fine either way but let's be consistent.
3.
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ if (IsSet(opts.specified_opts, SUBOPT_XID))
..
..
Is there a case when the above 'if (IsSet(..' won't be true? If not,
then probably there should be Assert instead of 'if'.
4.
+static TransactionId skipping_xid = InvalidTransactionId;
I find this variable name a bit odd. Can we name it skip_xid?
5.
+ * skipping_xid is valid if we are skipping all data modification changes
+ * (INSERT, UPDATE, etc.) of the specified transaction at
MySubscription->skipxid.
+ * Once we start skipping changes, we don't stop it until we skip all changes
I think it would be better to write the first line of comment as: "We
enable skipping all data modification changes (INSERT, UPDATE, etc.)
for the subscription if the user has specified skip_xid. Once we ..."
6.
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /* Make sure subscription cache is up-to-date */
+ maybe_reread_subscription();
Why do we need to update the cache here by calling
maybe_reread_subscription() and at other places in the patch? It is
sufficient to get the skip_xid value at the start of the worker via
GetSubscription().
7. In maybe_reread_subscription(), isn't there a need to check whether
skip_xid is changed where we exit and launch the worker and compare
other subscription parameters?
8.
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
It is important to add a comment as to why we need a lock here.
9.
+ * needs to be set subskipxid again. We can reduce the possibility by
+ * logging a replication origin WAL record to advance the origin LSN
+ * instead but it doesn't seem to be worth since it's a very minor case.
You can also add here that there is no way to advance origin_timestamp
so that would be inconsistent.
10.
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
{
..
..
+ if (!IsTransactionState())
+ StartTransactionCommand();
..
..
+ CommitTransactionCommand();
..
}
The transaction should be committed in this function if it is started
here otherwise it should be the responsibility of the caller to commit
it.
--
With Regards,
Amit Kapila.
On Fri, Jan 14, 2022 at 5:35 PM vignesh C <vignesh21@gmail.com> wrote:
Thanks for the updated patch, few minor comments:

1) Should "SKIP" be "SKIP (" here:
@@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
 /* ALTER SUBSCRIPTION <name> */
 else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
 COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
Won't the another rule as follows added by patch sufficient for what
you are asking?
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");
I might be missing something but why do you think the handling of SKIP
be any different than what we are doing for SET?
--
With Regards,
Amit Kapila.
On Sat, Jan 15, 2022 at 7:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 14, 2022 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I agree with all the comments above. I've attached an updated patch.
Review comments
================
Thank you for the comments!
1.
+
+ <para>
+ In this case, you need to consider changing the data on the subscriber so that it

The starting of this sentence doesn't make sense to me. How about
changing it like: "To resolve conflicts, you need to ...
Fixed.
2.
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the transaction whose changes are to be skipped
+ by the logical replication worker. Setting <literal>NONE</literal> resets
+ the transaction ID.
+ </para>

Empty spaces after line finish are inconsistent. I personally use a
single space before a new line but I see that others use two spaces
and the nearby documentation also uses two spaces in this regard so I
am fine either way but let's be consistent.
Fixed.
3.
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ if (IsSet(opts.specified_opts, SUBOPT_XID))
..
..

Is there a case when the above 'if (IsSet(..' won't be true? If not,
then probably there should be Assert instead of 'if'.
Fixed.
4.
+static TransactionId skipping_xid = InvalidTransactionId;

I find this variable name a bit odd. Can we name it skip_xid?
Okay, renamed.
5.
+ * skipping_xid is valid if we are skipping all data modification changes
+ * (INSERT, UPDATE, etc.) of the specified transaction at MySubscription->skipxid.
+ * Once we start skipping changes, we don't stop it until we skip all changes

I think it would be better to write the first line of comment as: "We
enable skipping all data modification changes (INSERT, UPDATE, etc.)
for the subscription if the user has specified skip_xid. Once we ..."
Changed.
6.
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /* Make sure subscription cache is up-to-date */
+ maybe_reread_subscription();

Why do we need to update the cache here by calling
maybe_reread_subscription() and at other places in the patch? It is
sufficient to get the skip_xid value at the start of the worker via
GetSubscription().
MySubscription could be out-of-date after a user changes the catalog.
In non-skipping change cases, we check it when starting the
transaction in begin_replication_step() which is called, e.g., when
applying an insert change. But I think we need to make sure it’s
up-to-date at the beginning of applying changes, that is, before
starting a transaction. Otherwise, we may end up skipping the
transaction based on an out-of-date subscription cache.
The reason for calling maybe_reread_subscription in both
apply_handle_commit_prepared() and apply_handle_rollback_prepared() is
the same; MySubscription could be out-of-date when applying
commit-prepared or rollback-prepared since we have not called
begin_replication_step() to open a new transaction.
7. In maybe_reread_subscription(), isn't there a need to check whether
skip_xid is changed where we exit and launch the worker and compare
other subscription parameters?
IIUC we relaunch the worker here when a subscription parameter such as
slot_name was changed. In the current implementation, I think that
relaunching the worker is not necessary when skip_xid is
changed. For instance, when skipping a prepared transaction, we
deliberately don’t clear subskipxid of pg_subscription and do that at
commit-prepared or rollback-prepared time. There is a chance that the
user changes skip_xid before commit-prepared or rollback-prepared, but
we tolerate this case.
Also, in non-streaming and non-2PC cases, while skipping changes we
don’t call maybe_reread_subscription() until all changes are skipped.
So it would not work to cancel skipping changes that has already started.
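
For reference, the user-facing flow being discussed is roughly the
following (a sketch using the v5 syntax; the subscription name and XID
are made up):

```sql
-- The failing transaction's XID comes from the CONTEXT line in the
-- subscriber's server log or from pg_stat_subscription_workers.
ALTER SUBSCRIPTION test_sub SKIP (xid = 716);

-- Until the worker has actually started skipping that transaction,
-- the setting can still be reset:
ALTER SUBSCRIPTION test_sub SKIP (xid = NONE);
```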
8.
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+                            TimestampTz origin_timestamp)
+{
+     Relation rel;
+     Form_pg_subscription subform;
+     HeapTuple tup;
+     bool nulls[Natts_pg_subscription];
+     bool replaces[Natts_pg_subscription];
+     Datum values[Natts_pg_subscription];
+
+     memset(values, 0, sizeof(values));
+     memset(nulls, false, sizeof(nulls));
+     memset(replaces, false, sizeof(replaces));
+
+     if (!IsTransactionState())
+         StartTransactionCommand();
+
+     LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+                      AccessShareLock);

It is important to add a comment as to why we need a lock here.
Added.
9.
+ * needs to be set subskipxid again. We can reduce the possibility by
+ * logging a replication origin WAL record to advance the origin LSN
+ * instead but it doesn't seem to be worth since it's a very minor case.

You can also add here that there is no way to advance origin_timestamp
so that would be inconsistent.
Added.
10.
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+                            TimestampTz origin_timestamp)
{
..
..
+     if (!IsTransactionState())
+         StartTransactionCommand();
..
..
+     CommitTransactionCommand();
..
}

The transaction should be committed in this function if it is started
here otherwise it should be the responsibility of the caller to commit
it.
Fixed.
I've attached an updated patch that incorporated these comments except
for 6 and 7 that we probably need more discussion on. The comments
from Vignesh are also incorporated.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v5-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch
From ece020de65036300491b119a9f9abad6d8967cc8 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v5] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
If an incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX),
updating pg_subscription.subskipxid field, telling the apply worker to
skip the transaction. The apply worker skips all data modification
changes within the specified transaction.
After skipping the transaction the apply worker clears
pg_subscription.subskipxid.
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 49 +++-
doc/src/sgml/ref/alter_subscription.sgml | 41 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/commands/subscriptioncmds.c | 53 ++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 273 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 8 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 126 ++++++----
src/test/regress/sql/subscription.sql | 18 ++
src/test/subscription/t/028_skip_xact.pl | 217 ++++++++++++++++
15 files changed, 764 insertions(+), 60 deletions(-)
create mode 100644 src/test/subscription/t/028_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2aeb2ef346..16f429b853 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7746,6 +7746,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskipxid</structfield> <type>xid</type>
+ </para>
+ <para>
+ ID of the transaction whose changes are to be skipped, if a valid
+ transaction ID; otherwise 0.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 96b4886e08..8a6971e0e3 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -353,15 +353,58 @@
<para>
The resolution can be done either by changing data or permissions on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the
+ transaction that conflicts with the existing data. When a conflict
+ produces an error, it is shown in
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_xid | 716
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-09-29 15:52:45.165754+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ To resolve conflicts, you need to consider changing the data on the subscriber so
+ that it doesn't conflict with incoming changes, or dropping the conflicting constraint
+ or unique index, or writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or as a last resort, by skipping the whole transaction.
+ Skipping the whole transaction includes skipping the change that may not violate
+ any constraint. This can easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..ec5eb6e8c4 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -207,6 +208,46 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ Skips applying changes of a particular transaction. If incoming data
+ violates any constraint, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming changes or by skipping
+ the whole transaction. The logical replication worker skips all data
+ modification changes within the specified transaction including the changes
+ that may not violate the constraint, so it should only be used as a last
+ resort. This option has no effect on the transaction that is already
+ prepared by enabling <literal>two_phase</literal> on the subscriber. After
+ the logical replication successfully skips the transaction, the transaction
+ ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the transaction whose changes are to be skipped
+ by the logical replication worker. Setting <literal>NONE</literal>
+ resets the transaction ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..da199e9a3e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skipxid = subform->subskipxid;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f5eba450ce..0ff0e00f19 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -61,6 +61,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +83,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId xid; /* InvalidTransactionId for resetting purpose, otherwise
+ * normal transaction id */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +252,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_XID) &&
+ strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (IsSet(opts->specified_opts, SUBOPT_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
+ else
+ {
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id: %s", xid_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_XID;
+ opts->xid = xid;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +494,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1115,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only xid option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_XID));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ update_tuple = true;
+
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index bb015a8bbd..0a0961dbb5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9954,6 +9954,15 @@ AlterSubscriptionStmt:
(Node *)makeInteger(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c9af775bc1..a2faba11f9 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -257,6 +258,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the user has specified skip_xid. Once we start skipping
+ * changes, we don't stop it until we skip all changes of the transaction even
+ * if pg_subscription is updated and MySubscription->skipxid gets changed or
+ * reset during that. Also, we don't skip receiving the changes in streaming
+ * cases, since we decide whether or not to skip applying the changes when
+ * starting to apply changes. At the end of the transaction, we disable it and reset
+ * the skip XID. The timing of resetting the skip XID varies between the commit
+ * and commit/rollback prepared cases. Please refer to the comments in the corresponding
+ * functions for details.
+ */
+static TransactionId skip_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skip_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -332,6 +348,13 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(bool commit, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+static void clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
@@ -791,6 +814,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -843,6 +871,11 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -856,6 +889,36 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
{
char gid[GIDSIZE];
+ /*
+ * If we are skipping all changes of this transaction, we stop it but
+ * unlike commit, we do not clear subskipxid of pg_subscription catalog
+ * here and will do that at commit prepared or rollback prepared time. If
+ * we update the catalog and then prepare the transaction, the changes
+ * will be part of the prepared transaction. Even if we do that in
+ * reverse order, subskipxid will not be cleared but this transaction
+ * won't be resent in a case where the server crashes between them.
+ *
+ * subskipxid might be changed or cleared by the user before we receive
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that's okay because this
+ * prepared transaction is empty.
+ *
+ * One might think that we can skip preparing the skipped transaction and
+ * also skip COMMIT PREPARED or ROLLBACK PREPARED by comparing the XID
+ * received as part of the message to subskipxid. But subskipxid could be
+ * changed by users between PREPARE and COMMIT PREPARED or ROLLBACK
+ * PREPARED. There was an idea to disallow users to change subskipxid
+ * while skipping changes. But we don't know when COMMIT PREPARED or
+ * ROLLBACK PREPARED comes and another conflict could occur in the
+ * meanwhile. If such another conflict occurs, we cannot skip the
+ * transaction by using subskipxid. Also, there was another idea to check
+ * whether the transaction has been prepared or not by checking GID,
+ * origin LSN, and origin timestamp of the prepared transaction but that
+ * doesn't seem worthwhile because it requires protocol changes, and
+ * skipping transactions shouldn't be common.
+ */
+ if (is_skipping_changes())
+ stop_skipping_changes(false, InvalidXLogRecPtr, 0);
+
/*
* Compute unique GID for two_phase transactions. We don't use GID of
* prepared transaction sent by server as that can lead to deadlock when
@@ -901,9 +964,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -940,6 +1003,26 @@ apply_handle_commit_prepared(StringInfo s)
logicalrep_read_commit_prepared(s, &prepare_data);
set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ /* Update the subscription cache if necessary */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid == prepare_data.xid)
+ {
+ /*
+ * Clear the subskipxid of pg_subscription catalog. This catalog
+ * update must be committed before finishing the prepared transaction.
+ * Otherwise, in a case where the server crashes between
+ * finishing prepared transaction and the catalog update, COMMIT
+ * PREPARED won't be resent but subskipxid is left.
+ *
+ * Also, we must not update the replication origin LSN and timestamp
+ * while committing the catalog update so that COMMIT PREPARED will be
+ * resent in case of a crash immediately after the catalog update
+ * commit.
+ */
+ clear_subscription_skip_xid(prepare_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
gid, sizeof(gid));
@@ -981,6 +1064,20 @@ apply_handle_rollback_prepared(StringInfo s)
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ /* Update the subscription cache if necessary */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid == rollback_data.xid)
+ {
+ /*
+ * Same as COMMIT PREPARED, we must clear subskipxid of
+ * pg_subscription before rolling back the prepared transaction.
+ * Please see the comments in apply_handle_commit_prepared() for
+ * details.
+ */
+ clear_subscription_skip_xid(rollback_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
gid, sizeof(gid));
@@ -1209,6 +1306,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of a transaction that is
+ * rolled back but if the skip XID is set, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
* just delete the files with serialized info.
@@ -1331,6 +1435,9 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1451,7 +1558,23 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
- if (IsTransactionState())
+ if (is_skipping_changes())
+ {
+ /*
+ * If we are skipping all changes of this transaction, we stop it and
+ * clear subskipxid of pg_subscription.
+ */
+ stop_skipping_changes(true, commit_data->end_lsn,
+ commit_data->committime);
+
+ /* Clearing subskipxid must be committed */
+ Assert(!IsTransactionState());
+
+ pgstat_report_stat(false);
+
+ store_flush_position(commit_data->end_lsn);
+ }
+ else if (IsTransactionState())
{
/*
* Update origin state so we can restart streaming from correct
@@ -2366,6 +2489,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3661,6 +3795,137 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /* Make sure subscription cache is up-to-date */
+ maybe_reread_subscription();
+
+ if (MySubscription->skipxid != xid)
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skip_xid = xid;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xid. If clear_subskipxid is true,
+ * we also clear subskipxid of pg_subscription by setting InvalidTransactionId.
+ */
+static void
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skip_xid)));
+
+ if (clear_subskipxid)
+ {
+ clear_subscription_skip_xid(skip_xid, origin_lsn, origin_timestamp);
+
+ /* Make sure that clearing subskipxid is committed */
+ if (IsTransactionState())
+ CommitTransactionCommand();
+ }
+
+ /* Stop skipping changes */
+ skip_xid = InvalidTransactionId;
+}
+
+/* clear subskipxid of pg_subscription catalog */
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool started_tx = false;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskipxid of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Get subskipxid value */
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskipxid of the tuple to InvalidTransactionId. If user
+ * has already changed subskipxid before clearing it we don't update the
+ * catalog and don't advance the replication origin state. So in the
+ * worst case, if the server crashes before sending an acknowledgment of
+ * the flush position the transaction will be sent again and the user
+ * needs to set subskipxid again. We can reduce the possibility by
+ * logging a replication origin WAL record to advance the origin LSN
+ * instead but there is no way to advance origin timestamp and it
+ * doesn't seem worthwhile since it's a very minor case.
+ */
+ if (subform->subskipxid == xid)
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_timestamp;
+
+ /* reset subskipxid */
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 92ab95724d..29aea5b56b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4301,6 +4301,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't fetch subskipxid as we don't
+ * include it in the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8587b19160..2895ddcea3 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6029,7 +6029,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false};
+ false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6066,8 +6066,10 @@ describeSubscriptions(const char *pattern, bool verbose)
/* Two_phase is only supported in v15 and higher */
if (pset.sversion >= 150000)
appendPQExpBuffer(&buf,
- ", subtwophasestate AS \"%s\"\n",
- gettext_noop("Two phase commit"));
+ ", subtwophasestate AS \"%s\"\n"
+ ", subskipxid AS \"%s\"\n",
+ gettext_noop("Two phase commit"),
+ gettext_noop("Skip XID"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6bd33a06cb..b5689ec609 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1710,7 +1710,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1726,6 +1726,12 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("xid");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c291289f..d4410da58f 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 413e7c85a1..ab3554f234 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3716,7 +3716,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..fdba3cff94 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,41 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 4294967295
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 0
+(1 row)
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ERROR: invalid transaction id: 1.1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction id: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ERROR: invalid transaction id: 1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+ERROR: invalid transaction id: 2
+-- fail - must be superuser
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+ERROR: must be owner of subscription regress_testsub
+SET SESSION AUTHORIZATION 'regress_subscription_user';
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+----------+--------------------+------------------------------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | off | dbname=regress_doesnotexist2
(1 row)
BEGIN;
@@ -129,10 +159,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+----------+--------------------+------------------------------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | local | dbname=regress_doesnotexist2
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +195,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +218,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication already exists
@@ -215,10 +245,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication used more then once
@@ -233,10 +263,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +300,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | 0 | off | dbname=regress_doesnotexist
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +312,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +324,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..39409295a3 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,24 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+
+-- fail - must be superuser
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+SET SESSION AUTHORIZATION 'regress_subscription_user';
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/028_skip_xact.pl b/src/test/subscription/t/028_skip_xact.pl
new file mode 100644
index 0000000000..4c107fc8f5
--- /dev/null
+++ b/src/test/subscription/t/028_skip_xact.pl
@@ -0,0 +1,217 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use Test::More tests => 7;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with the subscriber. After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# with ALTER SUBSCRIPTION ... SKIP. Then, we check that logical replication can
+# continue working by inserting $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname, $nonconflict_data,
+ $expected, $xid, $msg) = @_;
+
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND last_error_xid = '$xid'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]) or die "Timed out while waiting for worker error";
+
+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (xid = '$xid')");
+
+ # Restart the subscriber node to restart logical replication without
+ # waiting for wal_retrieve_retry_interval
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT subskipxid = 0 FROM pg_subscription WHERE subname = '$subname'");
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (two_phase = on, streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data into test_tab, raising an error on the subscriber due to a
+# violation of the unique constraint on test_tab. Then skip the transaction.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", $xid,
+ "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data into test_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", $xid,
+ "test skipping prepare and commit prepared");
+
+# Test for PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(4)", "4", $xid,
+ "test skipping prepare and rollback prepared");
+
+# Test for STREAM COMMIT. Insert enough rows into test_tab_streaming to exceed
+# the 64kB logical_decoding_work_mem limit, which also raises an error on the
+# subscriber while applying the spooled changes, for the same reason. Then
+# skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(2, md5(2::text))", "2", $xid,
+ "test skipping stream-commit");
+
+# Test for STREAM PREPARE and COMMIT PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(3, md5(3::text))", "3", $xid,
+ "test skipping stream-prepare and commit prepared");
+
+# Test for STREAM PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(4, md5(4::text))", "4", $xid,
+ "test skipping stream-prepare and rollback prepared");
+
+my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0", "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
On Fri, Jan 14, 2022 at 9:05 PM vignesh C <vignesh21@gmail.com> wrote:
On Fri, Jan 14, 2022 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 12, 2022 at 11:10 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, Jan 12, 2022 at 11:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. That is, currently we don't clear skip_xid while
handling a prepare message but do that while handling commit/rollback
prepared message, in order to avoid the worst case. If we do both
while handling a prepare message and the server crashes between them,
it ends up that skip_xid is cleared and the transaction will be
resent, which is identical to the worst case above.

How are you thinking to update the skip xid before prepare? If we do
it in the same transaction then the changes in the catalog will be
part of the prepared xact but won't be committed. Now, say if we do it
after prepare, then the situation won't be the same because after
restart the same xact won't appear again.

I was thinking to commit the catalog change first in a separate
transaction while not updating origin LSN and then prepare an empty
transaction while updating origin LSN.

But, won't it complicate the handling if in the future we try to
enhance this API such that it skips partial changes like skipping only
for particular relation(s) or particular operations as discussed
previously in this thread?

Right. I was thinking that if we accept the situation that the user
has to set skip_xid again in case of the server crashes, we might be
able to accept also the situation that the user has to clear skip_xid
in a case of the server crashes. But it seems the former is less
problematic.

I've attached an updated patch that incorporates all comments I got so far.
Thanks for the updated patch, few comments:
Thank you for the comments!
1) Currently skip xid is not displayed in describe subscriptions, can
we include it too:
\dRs+ sub1
                                                      List of subscriptions
 Name |  Owner  | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit |            Conninfo
------+---------+---------+-------------+--------+-----------+------------------+--------------------+--------------------------------
 sub1 | vignesh | t       | {pub1}      | f      | f         | e                | off                | dbname=postgres host=localhost
(1 row)

2) This import "use PostgreSQL::Test::Utils;" is not required:
+# Tests for skipping logical replication transactions.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 6;

3) Some of the comments use a punctuation mark and some of them do not. Should we keep it consistent?
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',

+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',

+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');

+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');

4) Should this be changed:
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we start skipping
+ * changes, we don't stop it until the we skip all changes of the transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid gets changed or
to:
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we start skipping
+ * changes, we don't stop it until we skip all changes of the transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid gets changed or
In "stop it until the we skip all changes", the "the" is not required.
I agree with all the comments above. I've attached an updated patch.
Thanks for the updated patch, few minor comments:
Thank you for the comments.
1) Should "SKIP" be "SKIP (" here:
@@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
 /* ALTER SUBSCRIPTION <name> */
 else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
 COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
As Amit mentioned, it's consistent with the SET option.
2) We could add a test for this if possible:
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));

3) There was one typo in the commit message, "transaciton" should be "transaction":
After skipping the transaciton the apply worker clears
pg_subscription.subskipxid.

Another small typo, "susbscriber" should be "subscriber":
+ prepared by enabling <literal>two_phase</literal> on susbscriber. After
+ the logical replication successfully skips the transaction, the transaction

4) Should skipsubxid be mentioned as subskipxid here:
+ * Clear the subskipxid of pg_subscription catalog. This catalog
+ * update must be committed before finishing prepared transaction.
+ * Because otherwise, in a case where the server crashes between
+ * finishing prepared transaction and the catalog update, COMMIT
+ * PREPARED won’t be resent but skipsubxid is left.
The above comments were incorporated into the latest v5 patch I just
submitted[1].
Regards,
[1]: /messages/by-id/CAD21AoCd3Y2-b67+pVrzrdteUmup1XG6JeHYOa5dGjh8qZ3VuQ@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
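To summarize the user-facing workflow this patch implements, here is a sketch assembled from the regression and TAP tests above. It assumes the column and option names of this patch version (pg_stat_subscription_workers.last_error_xid, ALTER SUBSCRIPTION ... SKIP (xid = ...), pg_subscription.subskipxid), which may change in later revisions:

```sql
-- On the subscriber, identify the failed remote transaction. The TAP test
-- above reads the XID from pg_stat_subscription_workers.last_error_xid.
SELECT subname, last_error_xid, last_error_command, last_error_message
FROM pg_stat_subscription_workers;

-- Tell the apply worker to skip all changes of that transaction
-- (XID 590 here is the example from the errcontext shown earlier).
ALTER SUBSCRIPTION tap_sub SKIP (xid = 590);

-- Once the transaction has been skipped, the apply worker clears the
-- catalog field, so subskipxid reads 0 again.
SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'tap_sub';

-- A pending skip can be cancelled before it takes effect:
ALTER SUBSCRIPTION tap_sub SKIP (xid = NONE);
```

Note that skipping requires owning the subscription (per the regression test, a non-owner gets "must be owner of subscription"), and the skipped transaction's data is lost, so this is a last resort after conflict resolution on the subscriber fails.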
On Mon, Jan 17, 2022 at 9:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Jan 15, 2022 at 7:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
6.
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /* Make sure subscription cache is up-to-date */
+ maybe_reread_subscription();

Why do we need to update the cache here by calling
maybe_reread_subscription() and at other places in the patch? It is
sufficient to get the skip_xid value at the start of the worker via
GetSubscription().

MySubscription could be out-of-date after a user changes the catalog.
In non-skipping change cases, we check it when starting the
transaction in begin_replication_step() which is called, e.g., when
applying an insert change. But I think we need to make sure it’s
up-to-date at the beginning of applying changes, that is, before
starting a transaction. Otherwise, we may end up skipping the
transaction based on out-of-dated subscription cache.
I thought the user would normally set skip_xid only after an error
which means that the value should be as new as the time of the start
of the worker. I am slightly worried about the cost we might need to
pay for this additional look-up in case skip_xid is not changed. Do
you see any valid user scenario where we might not see the required
skip_xid? I am okay with calling this if we really need it.
7. In maybe_reread_subscription(), isn't there a need to check whether
skip_xid is changed where we exit and launch the worker and compare
other subscription parameters?

IIUC we relaunch the worker here when subscription parameters such as
slot_name are changed. In the current implementation, I think that
relaunching the worker is not strictly necessary when skip_xid is
changed. For instance, when skipping the prepared transaction, we
deliberately don’t clear subskipxid of pg_subscription and do that at
commit-prepared or rollback-prepared case. There are chances that the
user changes skip_xid before commit-prepared or rollback-prepared. But
we tolerate this case.
I think between prepare and commit prepared, the user only needs to
change it if there is another error in which case we will anyway
restart and load the new value of same. But, I understand that we
don't need to restart if skip_xid is changed as it might not impact
remote connection in any way, so I am fine for not doing anything for
this.
--
With Regards,
Amit Kapila.
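The ordering constraint discussed above for prepared transactions (commit the catalog update clearing subskipxid before finishing the prepared transaction, without advancing the origin) can be sketched as a toy crash simulation in Python. This is illustrative only; the step names and XID are assumptions, not PostgreSQL code:

```python
def run(order, crash_between):
    """Toy model: simulate COMMIT PREPARED handling with a crash point.

    Returns (skipxid_after_recovery, commit_prepared_resent).
    """
    skipxid = 716            # pg_subscription.subskipxid (illustrative XID)
    origin_advanced = False  # once True, COMMIT PREPARED is not resent
    steps = (["clear_catalog", "finish_prepared"]
             if order == "catalog_first"
             else ["finish_prepared", "clear_catalog"])
    for i, step in enumerate(steps):
        if step == "clear_catalog":
            skipxid = 0             # committed catalog update; origin NOT advanced
        else:
            origin_advanced = True  # finishing the prepared txn advances origin
        if crash_between and i == 0:
            break                   # crash between the two operations
    return skipxid, not origin_advanced
```

In this model, clearing the catalog first and crashing leaves subskipxid cleared with COMMIT PREPARED resent (harmless), while the reverse order leaves a stale subskipxid that is never cleared, matching the reasoning in the patch's comments.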
On Mon, Jan 17, 2022 at 2:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jan 17, 2022 at 9:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sat, Jan 15, 2022 at 7:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
6.
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+   Assert(!is_skipping_changes());
+   Assert(!in_remote_transaction);
+   Assert(!in_streamed_transaction);
+
+   /* Make sure subscription cache is up-to-date */
+   maybe_reread_subscription();

Why do we need to update the cache here by calling
maybe_reread_subscription() and at other places in the patch? It is
sufficient to get the skip_xid value at the start of the worker via
GetSubscription().

MySubscription could be out-of-date after a user changes the catalog.
In non-skipping change cases, we check it when starting the
transaction in begin_replication_step() which is called, e.g., when
applying an insert change. But I think we need to make sure it’s
up-to-date at the beginning of applying changes, that is, before
starting a transaction. Otherwise, we may end up skipping the
transaction based on an out-of-date subscription cache.

I thought the user would normally set skip_xid only after an error
which means that the value should be as new as the time of the start
of the worker. I am slightly worried about the cost we might need to
pay for this additional look-up in case skip_xid is not changed. Do
you see any valid user scenario where we might not see the required
skip_xid? I am okay with calling this if we really need it.
Fair point. I've changed the code accordingly.
7. In maybe_reread_subscription(), isn't there a need to check whether
skip_xid is changed where we exit and launch the worker and compare
other subscription parameters?

IIUC we relaunch the worker here when subscription parameters such as
slot_name are changed. In the current implementation, I think that
relaunching the worker is not strictly necessary when skip_xid is
changed. For instance, when skipping the prepared transaction, we
deliberately don’t clear subskipxid of pg_subscription and do that at
commit-prepared or rollback-prepared case. There are chances that the
user changes skip_xid before commit-prepared or rollback-prepared. But
we tolerate this case.

I think between prepare and commit prepared, the user only needs to
change it if there is another error in which case we will anyway
restart and load the new value of same. But, I understand that we
don't need to restart if skip_xid is changed as it might not impact
remote connection in any way, so I am fine for not doing anything for
this.
I'll leave this part for now. We can change it later if others think
it's necessary.
I've attached an updated patch. Please review it.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
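As a rough sketch of the apply-worker behavior the v6 patch implements (skip decision latched at BEGIN, data-modification messages dropped, skip XID cleared at commit), here is a toy Python model. It is illustrative only and mirrors the patch's function names loosely:

```python
DML_ACTIONS = {"INSERT", "UPDATE", "DELETE", "TRUNCATE"}

class ApplyWorker:
    """Toy model of the skip logic in worker.c (illustrative, not real code)."""

    def __init__(self, subskipxid=0):
        self.subskipxid = subskipxid  # cf. pg_subscription.subskipxid
        self.skip_xid = 0             # cf. worker-local skip_xid
        self.applied = []

    def begin(self, xid):
        # cf. maybe_start_skipping_changes(): the decision is latched at
        # transaction start and is not revisited mid-transaction
        if self.subskipxid == xid:
            self.skip_xid = xid

    def change(self, action, row):
        # cf. apply_dispatch(): drop only data-modification messages
        if self.skip_xid and action in DML_ACTIONS:
            return
        self.applied.append((action, row))

    def commit(self):
        # cf. stop_skipping_changes(): clear the catalog value and the
        # local state once the whole transaction has been skipped
        if self.skip_xid:
            self.subskipxid = 0
            self.skip_xid = 0
```

A transaction whose XID matches subskipxid contributes no changes, and a later transaction applies normally because the skip XID has been reset.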
Attachments:
v6-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch
From 51ebee91cee2412e3e21d9f3292e428abb1a66e0 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v6] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
If an incoming change violates a constraint, logical replication stops
until the conflict is resolved. This commit introduces another way to skip
the transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify the XID with ALTER SUBSCRIPTION ... SKIP (xid = XXX),
which updates the pg_subscription.subskipxid field and tells the apply
worker to skip the transaction. The apply worker skips all data
modification changes within the specified transaction.
After skipping the transaction the apply worker clears
pg_subscription.subskipxid.
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 49 +++-
doc/src/sgml/ref/alter_subscription.sgml | 41 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/commands/subscriptioncmds.c | 53 +++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 264 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 8 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 126 ++++++----
src/test/regress/sql/subscription.sql | 18 ++
src/test/subscription/t/028_skip_xact.pl | 217 +++++++++++++++++
15 files changed, 755 insertions(+), 60 deletions(-)
create mode 100644 src/test/subscription/t/028_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2aeb2ef346..16f429b853 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7746,6 +7746,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskipxid</structfield> <type>xid</type>
+ </para>
+ <para>
+ ID of the transaction whose changes are to be skipped, if a valid
+ transaction ID; otherwise 0.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 96b4886e08..8a6971e0e3 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -353,15 +353,58 @@
<para>
The resolution can be done either by changing data or permissions on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping
+ the transaction that conflicts with the existing data. When a conflict
+ produces an error, it is shown in
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_xid | 716
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-09-29 15:52:45.165754+00
+</screen>
+
+ <para>
+ The transaction ID that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ To resolve conflicts, consider changing the data on the subscriber so
+ that it doesn't conflict with incoming changes, dropping the conflicting
+ constraint or unique index, writing a trigger on the subscriber to
+ suppress or redirect conflicting incoming changes, or, as a last resort,
+ skipping the whole transaction. Skipping the whole transaction also skips
+ changes that may not violate any constraint. This can easily make the
+ subscriber inconsistent, especially if a user specifies the wrong
+ transaction ID or the position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..ec5eb6e8c4 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -207,6 +208,46 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ Skips applying changes of the particular transaction. If incoming data
+ violates any constraints the logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming change or by skipping
+ the whole transaction. The logical replication worker skips all data
+ modification changes within the specified transaction including the changes
+ that may not violate the constraint, so it should only be used as a last
+ resort. This option has no effect on the transaction that is already
+ prepared by enabling <literal>two_phase</literal> on subscriber. After
+ the logical replication successfully skips the transaction, the transaction
+ ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the transaction whose changes are to be skipped
+ by the logical replication worker. Setting <literal>NONE</literal>
+ resets the transaction ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..da199e9a3e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skipxid = subform->subskipxid;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f5eba450ce..0ff0e00f19 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -61,6 +61,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +83,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId xid; /* InvalidTransactionId for resetting purpose, otherwise
+ * normal transaction id */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +252,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_XID) &&
+ strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (IsSet(opts->specified_opts, SUBOPT_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
+ else
+ {
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id: %s", xid_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_XID;
+ opts->xid = xid;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +494,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1115,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only xid option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_XID));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ update_tuple = true;
+
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index bb015a8bbd..0a0961dbb5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9954,6 +9954,15 @@ AlterSubscriptionStmt:
(Node *)makeInteger(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c9af775bc1..89ee083e3f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -257,6 +258,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the user has specified skip_xid. Once we start skipping
+ * changes, we don't stop it until we skip all changes of the transaction even
+ * if pg_subscription is updated and MySubscription->skipxid gets changed or
+ * reset during that. Also, we don't skip receiving the changes in streaming
+ * cases, since we decide whether or not to skip applying the changes when
+ * starting to apply changes. At the end of the transaction, we disable it and
+ * reset the skip XID. The timing of resetting the skip XID varies between the
+ * commit and the commit/rollback prepared cases. Please refer to the comments
+ * in the corresponding
+ * functions for details.
+ */
+static TransactionId skip_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skip_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -332,6 +348,13 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+static void clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
@@ -791,6 +814,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -843,6 +871,11 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -856,6 +889,36 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
{
char gid[GIDSIZE];
+ /*
+ * If we are skipping all changes of this transaction, we stop it but
+ * unlike commit, we do not clear subskipxid of pg_subscription catalog
+ * here and will do that at commit prepared or rollback prepared time. If
+ * we update the catalog and then prepare the transaction, the changes
+ * will be part of the prepared transaction. Even if we do that in
+ * reverse order, subskipxid will not be cleared but this transaction
+ * won't be resent in a case where the server crashes between them.
+ *
+ * subskipxid might be changed or cleared by the user before we receive
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that's okay because this
+ * prepared transaction is empty.
+ *
+ * One might think that we can skip preparing the skipped transaction and
+ * also skip COMMIT PREPARED or ROLLBACK PREPARED by comparing the XID
+ * received as part of the message to subskipxid. But subskipxid could be
+ * changed by users between PREPARE and COMMIT PREPARED or ROLLBACK
+ * PREPARED. There was an idea to disallow users to change subskipxid
+ * while skipping changes. But we don't know when COMMIT PREPARED or
+ * ROLLBACK PREPARED comes and another conflict could occur in the
+ * meanwhile. If such another conflict occurs, we cannot skip the
+ * transaction by using subskipxid. Also, there was another idea to check
+ * whether the transaction has been prepared or not by checking GID,
+ * origin LSN, and origin timestamp of the prepared transaction but that
+ * doesn't seem worthwhile because it requires protocol changes, and
+ * skipping transactions shouldn't be common.
+ */
+ if (is_skipping_changes())
+ stop_skipping_changes(false, InvalidXLogRecPtr, 0);
+
/*
* Compute unique GID for two_phase transactions. We don't use GID of
* prepared transaction sent by server as that can lead to deadlock when
@@ -901,9 +964,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -940,6 +1003,23 @@ apply_handle_commit_prepared(StringInfo s)
logicalrep_read_commit_prepared(s, &prepare_data);
set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ if (MySubscription->skipxid == prepare_data.xid)
+ {
+ /*
+ * Clear the subskipxid of pg_subscription catalog. This catalog
+ * update must be committed before finishing prepared transaction.
+ * Because otherwise, in a case where the server crashes between
+ * finishing prepared transaction and the catalog update, COMMIT
+ * PREPARED won't be resent but subskipxid is left.
+ *
+ * Also, we must not update the replication origin LSN and timestamp
+ * while committing the catalog update so that COMMIT PREPARED will be
+ * resent in case of a crash immediately after the catalog update
+ * commit.
+ */
+ clear_subscription_skip_xid(prepare_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
gid, sizeof(gid));
@@ -981,6 +1061,17 @@ apply_handle_rollback_prepared(StringInfo s)
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ if (MySubscription->skipxid == rollback_data.xid)
+ {
+ /*
+ * Same as COMMIT PREPARED, we must clear subskipxid of
+ * pg_subscription before rolling back the prepared transaction.
+ * Please see the comments in apply_handle_commit_prepared() for
+ * details.
+ */
+ clear_subscription_skip_xid(rollback_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
gid, sizeof(gid));
@@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of a transaction that is
+ * rolled back, but if the skip XID is set, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
* just delete the files with serialized info.
@@ -1331,6 +1429,9 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1451,7 +1552,23 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
- if (IsTransactionState())
+ if (is_skipping_changes())
+ {
+ /*
+ * If we are skipping all changes of this transaction, we stop it and
+ * clear subskipxid of pg_subscription.
+ */
+ stop_skipping_changes(true, commit_data->end_lsn,
+ commit_data->committime);
+
+ /* Clearing subskipxid must be committed */
+ Assert(!IsTransactionState());
+
+ pgstat_report_stat(false);
+
+ store_flush_position(commit_data->end_lsn);
+ }
+ else if (IsTransactionState())
{
/*
* Update origin state so we can restart streaming from correct
@@ -2366,6 +2483,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3661,6 +3789,134 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (MySubscription->skipxid != xid)
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skip_xid = xid;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xid. If clear_subskipxid is true,
+ * we also clear subskipxid of pg_subscription by setting InvalidTransactionId.
+ */
+static void
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skip_xid)));
+
+ if (clear_subskipxid)
+ {
+ clear_subscription_skip_xid(skip_xid, origin_lsn, origin_timestamp);
+
+ /* Make sure that clearing subskipxid is committed */
+ if (IsTransactionState())
+ CommitTransactionCommand();
+ }
+
+ /* Stop skipping changes */
+ skip_xid = InvalidTransactionId;
+}
+
+/* clear subskipxid of pg_subscription catalog */
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool started_tx = false;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskipxid of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Get subskipxid value */
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskipxid of the tuple to InvalidTransactionId. If the
+ * user has already changed subskipxid before we clear it, we don't
+ * update the catalog and don't advance the replication origin state.
+ * So in the worst case, if the server crashes before sending an
+ * acknowledgment of the flush position, the transaction will be sent
+ * again and the user needs to set subskipxid again. We can reduce the
+ * possibility by
+ * logging a replication origin WAL record to advance the origin LSN
+ * instead, but there is no way to advance the origin timestamp and it
+ * doesn't seem worth it since it's a very minor case.
+ */
+ if (subform->subskipxid == xid)
+ {
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_timestamp;
+
+ /* reset subskipxid */
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 92ab95724d..29aea5b56b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4301,6 +4301,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't fetch subskipxid as we don't
+ * include it in the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8587b19160..2895ddcea3 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6029,7 +6029,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false};
+ false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6066,8 +6066,10 @@ describeSubscriptions(const char *pattern, bool verbose)
/* Two_phase is only supported in v15 and higher */
if (pset.sversion >= 150000)
appendPQExpBuffer(&buf,
- ", subtwophasestate AS \"%s\"\n",
- gettext_noop("Two phase commit"));
+ ", subtwophasestate AS \"%s\"\n"
+ ", subskipxid AS \"%s\"\n",
+ gettext_noop("Two phase commit"),
+ gettext_noop("Skip XID"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6bd33a06cb..b5689ec609 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1710,7 +1710,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1726,6 +1726,12 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("xid");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c291289f..d4410da58f 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 413e7c85a1..ab3554f234 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3716,7 +3716,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..fdba3cff94 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,41 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 4294967295
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 0
+(1 row)
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ERROR: invalid transaction id: 1.1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction id: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ERROR: invalid transaction id: 1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+ERROR: invalid transaction id: 2
+-- fail - must be owner of the subscription
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+ERROR: must be owner of subscription regress_testsub
+SET SESSION AUTHORIZATION 'regress_subscription_user';
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+----------+--------------------+------------------------------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | off | dbname=regress_doesnotexist2
(1 row)
BEGIN;
@@ -129,10 +159,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+----------+--------------------+------------------------------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | local | dbname=regress_doesnotexist2
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +195,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +218,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication already exists
@@ -215,10 +245,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication used more then once
@@ -233,10 +263,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +300,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | 0 | off | dbname=regress_doesnotexist
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +312,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +324,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..39409295a3 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,24 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+
+-- fail - must be owner of the subscription
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+SET SESSION AUTHORIZATION 'regress_subscription_user';
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/028_skip_xact.pl b/src/test/subscription/t/028_skip_xact.pl
new file mode 100644
index 0000000000..4c107fc8f5
--- /dev/null
+++ b/src/test/subscription/t/028_skip_xact.pl
@@ -0,0 +1,217 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use Test::More tests => 7;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with the subscriber. After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# with ALTER SUBSCRIPTION ... SKIP. Then, we check that logical replication can
+# continue working by inserting $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname, $nonconflict_data,
+ $expected, $xid, $msg) = @_;
+
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND last_error_xid = '$xid'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]) or die "Timed out while waiting for worker error";
+
+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (xid = '$xid')");
+
+ # Restart the subscriber node to restart logical replication with no interval
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT subskipxid = 0 FROM pg_subscription WHERE subname = '$subname'");
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (two_phase = on, streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data into test_tab, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab. Then skip the transaction.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", $xid,
+ "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data into test_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", $xid,
+ "test skipping prepare and commit prepared");
+
+# Test for PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(4)", "4", $xid,
+ "test skipping prepare and rollback prepared");
+
+# Test for STREAM COMMIT. Insert enough rows into test_tab_streaming to exceed the
+# 64kB logical_decoding_work_mem limit, which also raises a unique-constraint error
+# on the subscriber while applying the spooled changes. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(2, md5(2::text))", "2", $xid,
+ "test skipping stream-commit");
+
+# Test for STREAM PREPARE and COMMIT PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(3, md5(3::text))", "3", $xid,
+ "test skipping stream-prepare and commit prepared");
+
+# Test for STREAM PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(4, md5(4::text))", "4", $xid,
+ "test skipping stream-prepare and rollback prepared");
+
+my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0", "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
On Monday, January 17, 2022 3:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. Please review it.
Hi, thank you for sharing a new patch.
A few comments on v6.
(1) doc/src/sgml/ref/alter_subscription.sgml
+ resort. This option has no effect on the transaction that is already
One TAB exists between "resort" and "This".
(2) Minor improvement suggestion of comment in src/backend/replication/logical/worker.c
+ * reset during that. Also, we don't skip receiving the changes in streaming
+ * cases, since we decide whether or not to skip applying the changes when
I suggest that you don't use 'streaming cases', because
"streaming cases" sounds a bit broader than your actual implementation.
We do skip streamed transactions, just not during the spooling phase, right?
I suggest the following instead:
"We don't skip receiving the changes at the phase to spool streaming transactions"
(3) in the comment of apply_handle_prepare_internal, two full-width characters.
3-1
+ * won’t be resent in a case where the server crashes between them.
3-2
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay because this
You have full-width characters for "won't" and "that's".
Could you please check ?
(4) typo
+ * the subscription if hte user has specified skip_xid. Once we start skipping
"hte" should be "the"?
(5)
I may be missing something here but, in one of
the past discussions, there seemed to be a consensus that
if the user specifies the XID of a subtransaction,
it would be better to skip only that subtransaction.
Is that out of scope for this patch?
If so, I suggest you describe it
either in the commit message or around the related code.
(6)
I feel it would be better to include, in the TAP test for this
feature, a test that skipping an aborted streaming transaction
clears the XID, so as to cover more of the new code paths.
Did you have any particular reason to omit that case?
(7)
I'd like more explanation of the reason for restarting the subscriber
in the TAP test, because this is not a mandatory operation.
(We can pass the TAP tests without this restart.)
From:
# Restart the subscriber node to restart logical replication with no interval
IIUC, the below would be better.
To:
# As an optimization to finish the tests earlier, restart the subscriber with no
# interval, rather than waiting for a new error to launch a new apply worker.
Best Regards,
Takamichi Osumi
On Monday, January 17, 2022 5:03 PM I wrote:
A few more minor comments:
(8) another full-width char in apply_handle_commit_prepared
+ * PREPARED won't be resent but subskipxid is left.
Kindly check "won't" ?
(9) the header comments of clear_subscription_skip_xid
+/* clear subskipxid of pg_subscription catalog */
Should it start with an uppercase letter?
(10) some variable declarations and initialization in clear_subscription_skip_xid
There's no harm in moving the code below into the branch taken
when the user hasn't changed subskipxid before
the apply worker clears it.
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
Best Regards,
Takamichi Osumi
On Mon, Jan 17, 2022 at 5:03 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Monday, January 17, 2022 3:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. Please review it.
Hi, thank you for sharing a new patch.
A few comments on v6.
Thank you for the comments!
(1) doc/src/sgml/ref/alter_subscription.sgml
+ resort. This option has no effect on the transaction that is already
One TAB exists between "resort" and "This".
Will remove.
(2) Minor improvement suggestion of comment in src/backend/replication/logical/worker.c
+ * reset during that. Also, we don't skip receiving the changes in streaming
+ * cases, since we decide whether or not to skip applying the changes when
I suggest that you don't use 'streaming cases', because
"streaming cases" sounds a bit broader than your actual implementation.
We do skip streamed transactions, just not during the spooling phase, right?
I suggest the following instead:
"We don't skip receiving the changes at the phase to spool streaming transactions"
I might be missing your point but I think it's correct that we don't
skip receiving the change of the transaction that is sent via
streaming protocol. And it doesn't sound broader to me. Could you
elaborate on that?
(3) in the comment of apply_handle_prepare_internal, two full-width characters.
3-1
+ * won’t be resent in a case where the server crashes between them.
3-2
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay because this
You have full-width characters for "won't" and "that's".
Could you please check ?
Which characters in "won't" are full-width characters? I could not find them.
(4) typo
+ * the subscription if hte user has specified skip_xid. Once we start skipping
"hte" should "the" ?
Will fix.
(5)
I can miss something here but, in one of
the past discussions, there seems a consensus that
if the user specifies XID of a subtransaction,
it would be better to skip only the subtransaction. This time, is it out of the scope of the patch?
If so, I suggest you include some description about it
either in the commit message or around codes related to it.
How can the user know the subtransaction XID? I suppose you refer to
streaming protocol cases, but while applying spooled changes we don't
report the subtransaction XID either in the server log or in
pg_stat_subscription_workers.
(6)
I feel it's a better idea to include a test of whether skipping an
aborted streaming transaction clears the XID in the TAP test for this
feature, in a sense to cover various new code paths. Did you have any
special reason to omit the case?
Which code path is newly covered by this aborted streaming transaction
test? I think that this patch is already covered even by the test for
a committed-and-streamed transaction. It doesn't matter whether the
streamed transaction is committed or aborted because an error occurs
while applying the spooled changes.
(7)
I want more explanation for the reason to restart the subscriber
in the TAP test because this is not a mandatory operation.
(We can pass the TAP tests without this restart)
From:
# Restart the subscriber node to restart logical replication with no interval
IIUC, below would be better.
To:
# As an optimization to finish tests earlier, restart the subscriber with no interval,
# rather than waiting for a new error to launch a new apply worker.
I could not understand why the proposed sentence has more information.
Does it mean you want to mention "As an optimization to finish tests
earlier"?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Jan 17, 2022 at 9:35 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Monday, January 17, 2022 5:03 PM I wrote:
Hi, thank you for sharing a new patch.
Few comments on the v6.
(1) doc/src/sgml/ref/alter_subscription.sgml
+ resort. This option has no effect on the transaction that is already
One TAB exists between "resort" and "This".
(2) Minor improvement suggestion of comment in
src/backend/replication/logical/worker.c
+ * reset during that. Also, we don't skip receiving the changes in streaming
+ * cases, since we decide whether or not to skip applying the changes when
I suggest that you don't use 'streaming cases', because what "streaming
cases" means sounds a bit broader than your actual implementation.
We do skip transactions in streaming cases but not during the spooling phase,
right? I suggest below.
"We don't skip receiving the changes at the phase to spool streaming
transactions"
(3) in the comment of apply_handle_prepare_internal, two full-width
characters.
3-1
+ * won’t be resent in a case where the server crashes between them.
3-2
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay because this
You have full-width characters for "won't" and "that's".
Could you please check ?
(4) typo
+ * the subscription if hte user has specified skip_xid. Once we start skipping
"hte" should be "the"?
(5)
I can miss something here but, in one of the past discussions, there seems a
consensus that if the user specifies XID of a subtransaction, it would be better
to skip only the subtransaction. This time, is it out of the scope of the patch?
If so, I suggest you include some description about it either in the commit
message or around codes related to it.
(6)
I feel it's a better idea to include a test of whether skipping an aborted streaming
transaction clears the XID in the TAP test for this feature, in a sense to cover
various new code paths. Did you have any special reason to omit the case?
(7)
I want more explanation for the reason to restart the subscriber in the TAP test
because this is not a mandatory operation.
(We can pass the TAP tests without this restart)
From:
# Restart the subscriber node to restart logical replication with no interval
IIUC, below would be better.
To:
# As an optimization to finish tests earlier, restart the subscriber with no
# interval, rather than waiting for a new error to launch a new apply worker.
Few more minor comments
Thank you for the comments!
(8) another full-width char in apply_handle_commit_prepared
+ * PREPARED won't be resent but subskipxid is left.
Kindly check "won't" ?
Again, I don't follow what you mean by full-width character in this context.
(9) the header comments of clear_subscription_skip_xid
+/* clear subskipxid of pg_subscription catalog */
Should start with an upper letter ?
Okay, I'll change it.
(10) some variable declarations and initialization of clear_subscription_skip_xid
There's no harm in moving the code below into the conditional branch
where the user didn't change the subskipxid before
the apply worker clears it.
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
Will move.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Jan 17, 2022 at 6:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
(5)
I can miss something here but, in one of
the past discussions, there seems a consensus that
if the user specifies XID of a subtransaction,
it would be better to skip only the subtransaction. This time, is it out of the scope of the patch?
If so, I suggest you include some description about it
either in the commit message or around codes related to it.
How can the user know the subtransaction XID? I suppose you refer to
streaming protocol cases, but while applying spooled changes we don't
report the subtransaction XID either in the server log or in
pg_stat_subscription_workers.
I also think in the current system users won't be aware of
subtransaction's XID but I feel Osumi-San's point is valid that we
should at least add it in docs that we allow to skip only top-level
xacts. Also, in the future, it won't be impossible to imagine that we
can have subtransaction's XID info also available to users as we have
that in the case of streaming xacts (See subxact_data).
Few minor points:
===============
1.
+ * the subscription if hte user has specified skip_xid.
Typo. /hte/the
2.
+ * PREPARED won’t be resent but subskipxid is left.
In the DiffMerge tool, won't is showing some funny characters. When I manually
removed the 't and added it again, everything was fine. I am not sure why
it is so. I think Osumi-San has also raised this complaint.
3.
+ /*
+ * We don't expect that the user set the XID of the transaction that is
+ * rolled back but if the skip XID is set, clear it.
+ */
/user set/user to set/
--
With Regards,
Amit Kapila.
On Mon, Jan 17, 2022 at 5:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. Please review it.
Some review comments for the v6 patch:
doc/src/sgml/logical-replication.sgml
(1) Expanded output
Since the view output is shown in "expanded output" mode, perhaps the
doc should say that, or alternatively add the following lines prior to
it, to make it clear:
postgres=# \x
Expanded display is on.
(2) Message output in server log
The actual CONTEXT text now just says "at ..." instead of "with commit
timestamp ...", so the doc needs to be updated as follows:
BEFORE:
+CONTEXT: processing remote data during "INSERT" for replication
target relation "public.test" in transaction 716 with commit timestamp
2021-09-29 15:52:45.165754+00
AFTER:
+CONTEXT: processing remote data during "INSERT" for replication
target relation "public.test" in transaction 716 at 2021-09-29
15:52:45.165754+00
(3)
The wording "the change" doesn't seem right here, so I suggest the
following update:
BEFORE:
+ Skipping the whole transaction includes skipping the change that
may not violate
AFTER:
+ Skipping the whole transaction includes skipping changes that may
not violate
doc/src/sgml/ref/alter_subscription.sgml
(4)
I have a number of suggested wording improvements:
BEFORE:
+ Skips applying changes of the particular transaction. If incoming data
+ violates any constraints the logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming change or
by skipping
+ the whole transaction. The logical replication worker skips all data
+ modification changes within the specified transaction including
the changes
+ that may not violate the constraint, so, it should only be used as a last
+ resort. This option has no effect on the transaction that is already
+ prepared by enabling <literal>two_phase</literal> on subscriber.
AFTER:
+ Skips applying all changes of the specified transaction. If
incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming change or
by skipping
+ the whole transaction. Using the SKIP option, the logical
replication worker skips all data
+ modification changes within the specified transaction, including changes
+ that may not violate the constraint, so, it should only be used as a last
+ resort. This option has no effect on transactions that are already
+ prepared by enabling <literal>two_phase</literal> on the subscriber.
(5)
change -> changes
BEFORE:
+ subscriber so that it doesn't conflict with incoming change or
by skipping
AFTER:
+ subscriber so that it doesn't conflict with incoming changes or
by skipping
src/backend/replication/logical/worker.c
(6) Missing word?
The following should say "worth doing" or "worth it"?
+ * doesn't seem to be worth since it's a very minor case.
src/test/regress/sql/subscription.sql
(7) Misleading test case
I think the following test case is misleading and should be removed,
because the "1.1" xid value is only regarded as invalid because "1" is
an invalid xid (and there's already a test case for a "1" xid) - the
fractional part gets thrown away, and doesn't affect the validity
here.
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Jan 17, 2022 at 10:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jan 17, 2022 at 6:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
(5)
I can miss something here but, in one of
the past discussions, there seems a consensus that
if the user specifies XID of a subtransaction,
it would be better to skip only the subtransaction. This time, is it out of the scope of the patch?
If so, I suggest you include some description about it
either in the commit message or around codes related to it.
How can the user know the subtransaction XID? I suppose you refer to
streaming protocol cases, but while applying spooled changes we don't
report the subtransaction XID either in the server log or in
pg_stat_subscription_workers.
I also think in the current system users won't be aware of
subtransaction's XID but I feel Osumi-San's point is valid that we
should at least add it in docs that we allow to skip only top-level
xacts. Also, in the future, it won't be impossible to imagine that we
can have subtransaction's XID info also available to users as we have
that in the case of streaming xacts (See subxact_data).
Fair point and more accurate, but I'm a bit concerned that using these
words could confuse the user. There are some places in the doc where
we use the words “top-level transaction” and “sub transactions” but
these are not commonly used in the doc. The user normally would not be
aware that sub transactions are used to implement SAVEPOINTs. Also,
the publisher's subtransaction ID doesn’t appear anywhere on the
subscriber. So if we want to mention it, I think we should use other
words instead of them but I don’t have a good idea for that. Do you
have any ideas?
Few minor points:
===============
1.
+ * the subscription if hte user has specified skip_xid.
Typo. /hte/the
Will fix.
2.
+ * PREPARED won’t be resent but subskipxid is left.
In the DiffMerge tool, won't is showing some funny characters. When I manually
removed the 't and added it again, everything was fine. I am not sure why
it is so. I think Osumi-San has also raised this complaint.
Oh I didn't realize that. I'll check it again by using diffmerge tool.
3.
+ /*
+ * We don't expect that the user set the XID of the transaction that is
+ * rolled back but if the skip XID is set, clear it.
+ */
/user set/user to set/
Will fix.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jan 18, 2022 at 10:36 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Jan 17, 2022 at 5:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. Please review it.
Some review comments for the v6 patch:
Thank you for the comments!
doc/src/sgml/logical-replication.sgml
(1) Expanded output
Since the view output is shown in "expanded output" mode, perhaps the
doc should say that, or alternatively add the following lines prior to
it, to make it clear:
postgres=# \x
Expanded display is on.
I'm not sure it's really necessary. A similar example would be
perform.sgml but it doesn't say "\x".
(2) Message output in server log
The actual CONTEXT text now just says "at ..." instead of "with commit
timestamp ...", so the doc needs to be updated as follows:
BEFORE:
+CONTEXT: processing remote data during "INSERT" for replication
target relation "public.test" in transaction 716 with commit timestamp
2021-09-29 15:52:45.165754+00
AFTER:
+CONTEXT: processing remote data during "INSERT" for replication
target relation "public.test" in transaction 716 at 2021-09-29
15:52:45.165754+00
Will fix.
(3)
The wording "the change" doesn't seem right here, so I suggest the
following update:
BEFORE:
+ Skipping the whole transaction includes skipping the change that may not violate
AFTER:
+ Skipping the whole transaction includes skipping changes that may not violate
doc/src/sgml/ref/alter_subscription.sgml
Will fix.
(4)
I have a number of suggested wording improvements:
BEFORE:
+ Skips applying changes of the particular transaction. If incoming data
+ violates any constraints the logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming change or by skipping
+ the whole transaction. The logical replication worker skips all data
+ modification changes within the specified transaction including the changes
+ that may not violate the constraint, so, it should only be used as a last
+ resort. This option has no effect on the transaction that is already
+ prepared by enabling <literal>two_phase</literal> on subscriber.
AFTER:
+ Skips applying all changes of the specified transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming change or by skipping
+ the whole transaction. Using the SKIP option, the logical replication worker skips all data
+ modification changes within the specified transaction, including changes
+ that may not violate the constraint, so, it should only be used as a last
+ resort. This option has no effect on transactions that are already
+ prepared by enabling <literal>two_phase</literal> on the subscriber.
Will fix.
(5)
change -> changes
BEFORE:
+ subscriber so that it doesn't conflict with incoming change or by skipping
AFTER:
+ subscriber so that it doesn't conflict with incoming changes or by skipping
Will fix.
src/backend/replication/logical/worker.c
(6) Missing word?
The following should say "worth doing" or "worth it"?
+ * doesn't seem to be worth since it's a very minor case.
Will fix.
src/test/regress/sql/subscription.sql
(7) Misleading test case
I think the following test case is misleading and should be removed,
because the "1.1" xid value is only regarded as invalid because "1" is
an invalid xid (and there's already a test case for a "1" xid) - the
fractional part gets thrown away, and doesn't affect the validity
here.
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
Good point. Will remove.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jan 18, 2022 at 8:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jan 17, 2022 at 10:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jan 17, 2022 at 6:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
(5)
I can miss something here but, in one of
the past discussions, there seems a consensus that
if the user specifies XID of a subtransaction,
it would be better to skip only the subtransaction. This time, is it out of the scope of the patch?
If so, I suggest you include some description about it
either in the commit message or around codes related to it.
How can the user know the subtransaction XID? I suppose you refer to
streaming protocol cases, but while applying spooled changes we don't
report the subtransaction XID either in the server log or in
pg_stat_subscription_workers.
I also think in the current system users won't be aware of
subtransaction's XID but I feel Osumi-San's point is valid that we
should at least add it in docs that we allow to skip only top-level
xacts. Also, in the future, it won't be impossible to imagine that we
can have subtransaction's XID info also available to users as we have
that in the case of streaming xacts (See subxact_data).
Fair point and more accurate, but I'm a bit concerned that using these
words could confuse the user. There are some places in the doc where
we use the words “top-level transaction” and “sub transactions” but
these are not commonly used in the doc. The user normally would not be
aware that sub transactions are used to implement SAVEPOINTs. Also,
the publisher's subtransaction ID doesn’t appear anywhere on the
subscriber. So if we want to mention it, I think we should use other
words instead of them but I don’t have a good idea for that. Do you
have any ideas?
How about changing existing text:
+ Specifies the ID of the transaction whose changes are to be skipped
+ by the logical replication worker. Setting <literal>NONE</literal>
+ resets the transaction ID.
to
Specifies the top-level transaction identifier whose changes are to be
skipped by the logical replication worker. We don't support skipping
individual subtransactions. Setting <literal>NONE</literal> resets
the transaction ID.
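As a concrete illustration of the wording above (the subscription name and XID here are hypothetical examples, not taken from the patch's tests), the option would be used like this:

```sql
-- Skip the failing top-level remote transaction; the XID comes from the
-- apply worker's error context. Subtransaction XIDs are not supported
-- as targets.
ALTER SUBSCRIPTION test_sub SKIP (xid = 716);

-- Reset the setting manually, if needed:
ALTER SUBSCRIPTION test_sub SKIP (xid = NONE);
```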
--
With Regards,
Amit Kapila.
On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. Please review it.
Thanks for updating the patch. Few comments:
1)
/* Two_phase is only supported in v15 and higher */
if (pset.sversion >= 150000)
appendPQExpBuffer(&buf,
- ", subtwophasestate AS \"%s\"\n",
- gettext_noop("Two phase commit"));
+ ", subtwophasestate AS \"%s\"\n"
+ ", subskipxid AS \"%s\"\n",
+ gettext_noop("Two phase commit"),
+ gettext_noop("Skip XID"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
I think "skip xid" should be mentioned in the comment. Maybe it could be changed to:
"Two_phase and skip XID are only supported in v15 and higher"
2) The following two places are not consistent in whether "= value" is surrounded
with square brackets.
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
Should we modify the first place to:
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )
Because currently there is only one skip_option - xid, and a parameter must be
specified when using it.
3)
+ * Protect subskip_xid of pg_subscription from being concurrently updated
+ * while clearing it.
"subskip_xid" should be "subskipxid" I think.
4)
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
The option name was "skip_xid" in the previous version, and it is "xid" in
latest patch. So should we modify "skip_xid option" to "skip xid option", or
"skip option xid", or something else?
Also the following place has similar issue:
+ * the subscription if hte user has specified skip_xid. Once we start skipping
Regards,
Tang
On Monday, January 17, 2022 9:52 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for the comments!
..
(2) Minor improvement suggestion of comment in
src/backend/replication/logical/worker.c
+ * reset during that. Also, we don't skip receiving the changes in streaming
+ * cases, since we decide whether or not to skip applying the changes when
I suggest that you don't use 'streaming cases', because what
"streaming cases" means sounds a bit broader than your actual implementation.
We do skip transactions in streaming cases but not during the spooling phase,
right?
I suggest below.
"We don't skip receiving the changes at the phase to spool streaming
transactions"
I might be missing your point but I think it's correct that we don't skip receiving
the change of the transaction that is sent via streaming protocol. And it doesn't
sound broader to me. Could you elaborate on that?
OK. Excuse me for the lack of explanation.
I felt "streaming cases" implies "non-streaming cases"
to compare a difference (in my head) when it is
used to explain something at first.
I imagined the contrast between those when I saw it.
Thus, I thought "streaming cases" meant
whole flow of streaming transactions which consists of messages
surrounded by stream start and stream stop and which are finished by
stream commit/stream abort (including 2PC variations).
When I come back to the subject, you wrote below in the comment
"we don't skip receiving the changes in streaming cases,
since we decide whether or not to skip applying the changes
when starting to apply changes"
The first part of this sentence
("we don't skip receiving the changes in streaming cases")
gives me the impression that we don't skip changes in the streaming cases
(of my understanding above), but the last part
("we decide whether or not to skip applying the changes
when starting to apply changes") means we skip streamed transactions at the apply phase.
So, this sentence looked slightly confusing to me.
Thus, I suggested below (and when I connect it with existing part)
"we don't skip receiving the changes at the phase to spool streaming transactions
since we decide whether or not to skip applying the changes when starting to apply changes"
For me this looked better, but of course, this is a suggestion.
(3) in the comment of apply_handle_prepare_internal, two full-width
characters.
3-1
+ * won’t be resent in a case where the server crashes between them.
3-2
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay because this
You have full-width characters for "won't" and "that's".
Could you please check ?
Which characters in "won't" are full-width characters? I could not find them.
All characters I found and mentioned as full-width are single quotes.
It might be good to check the entire patch once
with some tool that helps you detect them.
(5)
I can miss something here but, in one of the past discussions, there
seems a consensus that if the user specifies XID of a subtransaction,
it would be better to skip only the subtransaction. This time, is it out of the scope of the patch?
If so, I suggest you include some description about it either in the
commit message or around codes related to it.
How can the user know the subtransaction XID? I suppose you refer to streaming
protocol cases, but while applying spooled changes we don't report
the subtransaction XID either in the server log or in pg_stat_subscription_workers.
Yeah, usually subtransaction XID is not exposed to the users. I agree.
But, clarifying that the target of this feature is only top-level transactions
sounds better to me. Thank you Amit-san for your support
about how we should write it in [1]!
(6)
I feel it's a better idea to include a test of whether skipping an aborted
streaming transaction clears the XID in the TAP test for this feature,
in a sense to cover various new code paths. Did you have any special
reason to omit the case?
Which code path is newly covered by this aborted streaming transaction
test? I think that this patch is already covered even by the test for a
committed-and-streamed transaction. It doesn't matter whether the streamed
transaction is committed or aborted because an error occurs while applying the
spooled changes.
Oh, this was my mistake. What I expressed as a new path is
apply_handle_stream_abort -> clear_subscription_skip_xid.
But this was totally wrong, as you explained.
(7)
I want more explanation for the reason to restart the subscriber in
the TAP test because this is not a mandatory operation.
(We can pass the TAP tests without this restart)
From:
# Restart the subscriber node to restart logical replication with no interval
IIUC, below would be better.
To:
# As an optimization to finish tests earlier, restart the subscriber
# with no interval, rather than waiting for a new error to launch a new apply worker.
I could not understand why the proposed sentence has more information.
Does it mean you want to mention "As an optimization to finish tests earlier"?
Yes, exactly. The point is to add "As an optimization to finish tests earlier".
Probably, I should have asked a simple question: "why do you restart the subscriber?"
At first sight, I couldn't understand the reason for the restart, and
you don't explain the reason itself.
[1]: /messages/by-id/CAA4eK1JHUF7fVNHQ1ZRRgVsdE8XDY8BruU9dNP3Q3jizNdpEbg@mail.gmail.com
Best Regards,
Takamichi Osumi
On Tue, Jan 18, 2022 at 8:34 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
2) The following two places are not consistent in whether "= value" is surrounded
with square brackets.
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
Should we modify the first place to:
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )
Because currently there is only one skip_option - xid, and a parameter must be
specified when using it.
Good observation. Do we really need [, ... ] as currently, we support
only one value for XID?
--
With Regards,
Amit Kapila.
On Tue, Jan 18, 2022 at 12:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 18, 2022 at 8:34 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
2) The following two places are not consistent in whether "= value" is surrounded
with square brackets.
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
Should we modify the first place to:
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )
Because currently there is only one skip_option - xid, and a parameter must be
specified when using it.
Good observation. Do we really need [, ... ] as currently we support
only one value for XID?
I think no. In the doc, it should be:
ALTER SUBSCRIPTION name SKIP ( skip_option = value )
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jan 18, 2022 at 12:04 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. Please review it.
Thanks for updating the patch. Few comments:
1)
/* Two_phase is only supported in v15 and higher */
if (pset.sversion >= 150000)
appendPQExpBuffer(&buf,
- ", subtwophasestate AS \"%s\"\n",
- gettext_noop("Two phase commit"));
+ ", subtwophasestate AS \"%s\"\n"
+ ", subskipxid AS \"%s\"\n",
+ gettext_noop("Two phase commit"),
+ gettext_noop("Skip XID"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
I think "skip xid" should be mentioned in the comment. Maybe it could be changed to:
"Two_phase and skip XID are only supported in v15 and higher"
Added.
2) The following two places are not consistent in whether "= value" is surrounded
with square brackets.
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
Should we modify the first place to:
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )
Because currently there is only one skip_option - xid, and a parameter must be
specified when using it.
Good catch. Fixed.
3)
+ * Protect subskip_xid of pg_subscription from being concurrently updated
+ * while clearing it.
"subskip_xid" should be "subskipxid" I think.
Fixed.
4)
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
The option name was "skip_xid" in the previous version, and it is "xid" in
the latest patch. So should we modify "skip_xid option" to "skip xid option", or
"skip option xid", or something else?
Also the following place has a similar issue:
+ * the subscription if hte user has specified skip_xid. Once we start skipping
Fixed.
I've attached an updated patch. All comments I got so far were
incorporated into this patch unless I'm missing something.
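For reference, a minimal sketch of the v7 behavior described above (the subscription name and XID are hypothetical placeholders): the XID specified via SKIP is recorded in pg_subscription.subskipxid and cleared by the apply worker once the transaction has been skipped.

```sql
-- Tell the apply worker to skip remote transaction 716
-- (the XID reported in the worker's error context).
ALTER SUBSCRIPTION test_sub SKIP (xid = 716);

-- subskipxid stays set until the worker finishes skipping that
-- transaction, after which it reads 0 again.
SELECT subname, subskipxid
FROM pg_subscription
WHERE subname = 'test_sub';
```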
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v7-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch (application/x-patch)
From 9faf874a7388368f86c500e1fef9616ecf86e5b5 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v7] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX),
updating pg_subscription.subskipxid field, telling the apply worker to
skip the transaction. The apply worker skips all data modification
changes within the specified transaction.
After skipping the transaction the apply worker clears
pg_subscription.subskipxid.
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 49 +++-
doc/src/sgml/ref/alter_subscription.sgml | 42 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/commands/subscriptioncmds.c | 53 +++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 265 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 10 +-
src/bin/psql/tab-complete.c | 8 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 124 ++++++----
src/test/regress/sql/subscription.sql | 17 ++
src/test/subscription/t/028_skip_xact.pl | 217 +++++++++++++++++
15 files changed, 755 insertions(+), 61 deletions(-)
create mode 100644 src/test/subscription/t/028_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2aeb2ef346..16f429b853 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7746,6 +7746,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskipxid</structfield> <type>xid</type>
+ </para>
+ <para>
+ ID of the transaction whose changes are to be skipped, if a valid
+ transaction ID; otherwise 0.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 96b4886e08..de4f83bbcc 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -353,15 +353,58 @@
<para>
The resolution can be done either by changing data or permissions on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the
+ transaction that conflicts with the existing data. When a conflict
+ produces an error, it is shown in
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_xid | 716
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 at 2021-09-29 15:52:45.165754+00
+</screen>
+
+ <para>
+ The transaction that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ To resolve conflicts, consider changing the data on the subscriber so
+ that it doesn't conflict with incoming changes, dropping the conflicting constraint
+ or unique index, writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or, as a last resort, skipping the whole transaction.
+ Skipping the whole transaction includes skipping changes that may not violate
+ any constraint. This can easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..8b4568ddab 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -207,6 +208,47 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the specified transaction. If incoming data
+ violates any constraint, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming changes or by skipping
+ the whole transaction. Using the <command>ALTER SUBSCRIPTION ... SKIP</command>
+ command, the logical replication worker skips all data modification changes
+ within the specified transaction, including changes that may not violate
+ the constraint, so it should only be used as a last resort. This option has
+ no effect on transactions that are already prepared by enabling
+ <literal>two_phase</literal> on the subscriber. After the logical replication
+ worker successfully skips the transaction, the transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the transaction whose changes are to be skipped
+ by the logical replication worker. We don't support skipping
+ individual subtransactions. Setting <literal>NONE</literal>
+ resets the transaction ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..da199e9a3e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skipxid = subform->subskipxid;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f5eba450ce..0ff0e00f19 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -61,6 +61,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +83,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId xid; /* InvalidTransactionId for resetting purpose, otherwise
+ * normal transaction id */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +252,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_XID) &&
+ strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (IsSet(opts->specified_opts, SUBOPT_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
+ else
+ {
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id: %s", xid_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_XID;
+ opts->xid = xid;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +494,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1115,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only xid option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_XID));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ update_tuple = true;
+
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index bb015a8bbd..0a0961dbb5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9954,6 +9954,15 @@ AlterSubscriptionStmt:
(Node *)makeInteger(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c9af775bc1..0264e30112 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -257,6 +258,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the user has specified subskipxid. Once we start skipping
+ * changes, we don't stop until we have skipped all changes of the transaction,
+ * even if pg_subscription is updated and MySubscription->skipxid gets changed
+ * or reset in the meantime. Also, in streaming transaction cases, we don't skip
+ * receiving and spooling the changes, since we decide whether or not to skip
+ * applying the changes when starting to apply them. At the end of the
+ * transaction, we disable skipping and reset subskipxid. The timing of resetting
+ * subskipxid varies between the commit and commit/rollback prepared cases.
+ * Please refer to the comments in the corresponding functions for details.
+ */
+static TransactionId skip_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skip_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -332,6 +348,13 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(bool commit, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+static void clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
@@ -791,6 +814,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -843,6 +871,11 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -856,6 +889,36 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
{
char gid[GIDSIZE];
+ /*
+ * If we are skipping all changes of this transaction, we stop doing so
+ * here but, unlike commit, we do not clear subskipxid of the pg_subscription
+ * catalog; we will do that at commit prepared or rollback prepared time. If
+ * we updated the catalog and then prepared the transaction, the catalog
+ * change would be part of the prepared transaction. If we did it in the
+ * reverse order, subskipxid would not be cleared if the server crashed
+ * between the two steps, since this prepared transaction won't be resent.
+ *
+ * subskipxid might be changed or cleared by the user before we receive
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that's okay because this
+ * prepared transaction is empty.
+ *
+ * One might think that we could skip preparing the skipped transaction and
+ * also skip COMMIT PREPARED or ROLLBACK PREPARED by comparing the XID
+ * received as part of the message to subskipxid. But subskipxid could be
+ * changed by users between PREPARE and COMMIT PREPARED or ROLLBACK
+ * PREPARED. There was an idea to disallow users from changing subskipxid
+ * while skipping changes, but we don't know when COMMIT PREPARED or
+ * ROLLBACK PREPARED will come, and another conflict could occur in the
+ * meantime; if it did, we could not skip that transaction by using
+ * subskipxid. There was also an idea to check whether the transaction has
+ * been prepared or not by checking the GID, origin LSN, and origin
+ * timestamp of the prepared transaction, but that doesn't seem worthwhile
+ * because it requires protocol changes, and skipping transactions
+ * shouldn't be common.
+ */
+ if (is_skipping_changes())
+ stop_skipping_changes(false, InvalidXLogRecPtr, 0);
+
/*
* Compute unique GID for two_phase transactions. We don't use GID of
* prepared transaction sent by server as that can lead to deadlock when
@@ -901,9 +964,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -940,6 +1003,23 @@ apply_handle_commit_prepared(StringInfo s)
logicalrep_read_commit_prepared(s, &prepare_data);
set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ if (MySubscription->skipxid == prepare_data.xid)
+ {
+ /*
+ * Clear the subskipxid of the pg_subscription catalog. This catalog
+ * update must be committed before finishing the prepared transaction,
+ * because otherwise, if the server crashes between finishing the
+ * prepared transaction and the catalog update, COMMIT PREPARED won't
+ * be resent but subskipxid would be left set.
+ *
+ * Also, we must not update the replication origin LSN and timestamp
+ * while committing the catalog update so that COMMIT PREPARED will be
+ * resent in case of a crash immediately after the catalog update
+ * commit.
+ */
+ clear_subscription_skip_xid(prepare_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
gid, sizeof(gid));
@@ -981,6 +1061,17 @@ apply_handle_rollback_prepared(StringInfo s)
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ if (MySubscription->skipxid == rollback_data.xid)
+ {
+ /*
+ * Same as COMMIT PREPARED, we must clear subskipxid of
+ * pg_subscription before rolling back the prepared transaction.
+ * Please see the comments in apply_handle_commit_prepared() for
+ * details.
+ */
+ clear_subscription_skip_xid(rollback_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
gid, sizeof(gid));
@@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of the transaction that is
+ * rolled back but if the skip XID is set, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
* just delete the files with serialized info.
@@ -1331,6 +1429,9 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1451,7 +1552,23 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
- if (IsTransactionState())
+ if (is_skipping_changes())
+ {
+ /*
+ * If we are skipping all changes of this transaction, we stop it and
+ * clear subskipxid of pg_subscription.
+ */
+ stop_skipping_changes(true, commit_data->end_lsn,
+ commit_data->committime);
+
+ /* Clearing subskipxid must be committed */
+ Assert(!IsTransactionState());
+
+ pgstat_report_stat(false);
+
+ store_flush_position(commit_data->end_lsn);
+ }
+ else if (IsTransactionState())
{
/*
* Update origin state so we can restart streaming from correct
@@ -2366,6 +2483,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3661,6 +3789,135 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by subskipxid.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (MySubscription->skipxid != xid)
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skip_xid = xid;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xid. If clear_subskipxid is true,
+ * we also clear subskipxid of pg_subscription by setting it to InvalidTransactionId.
+ */
+static void
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skip_xid)));
+
+ if (clear_subskipxid)
+ {
+ clear_subscription_skip_xid(skip_xid, origin_lsn, origin_timestamp);
+
+ /* Make sure that clearing subskipxid is committed */
+ if (IsTransactionState())
+ CommitTransactionCommand();
+ }
+
+ /* Stop skipping changes */
+ skip_xid = InvalidTransactionId;
+}
+
+/* Clear subskipxid of pg_subscription catalog */
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool started_tx = false;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskipxid of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Get subskipxid value */
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskipxid of the tuple to InvalidTransactionId. If the
+ * user has already changed subskipxid before we clear it, we don't update
+ * the catalog and don't advance the replication origin state. So in the
+ * worst case, if the server crashes before sending an acknowledgment of
+ * the flush position, the transaction will be sent again and the user
+ * needs to set subskipxid again. We could reduce the possibility by
+ * logging a replication origin WAL record to advance the origin LSN
+ * instead, but there is no way to advance the origin timestamp and it
+ * doesn't seem worth it since this is a very minor case.
+ */
+ if (subform->subskipxid == xid)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskipxid */
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_timestamp;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
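The filtering rule that the apply_dispatch() hunk above adds is deliberately narrow: while skipping, only data-modification messages are dropped, and control messages (BEGIN, COMMIT, RELATION, etc.) still flow so that transaction boundaries and origin advancement keep working. A minimal model of that rule (illustrative C; the enum names are ours, not the wire protocol's):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative subset of logical replication message kinds */
typedef enum
{
	MSG_BEGIN, MSG_COMMIT, MSG_RELATION,
	MSG_INSERT, MSG_UPDATE, MSG_DELETE, MSG_TRUNCATE
} RepMsg;

/*
 * Mirrors the rule added to apply_dispatch(): while skipping the
 * transaction, only data-modification messages are dropped.
 */
static bool
message_is_skipped(bool skipping_changes, RepMsg action)
{
	if (!skipping_changes)
		return false;
	return action == MSG_INSERT || action == MSG_UPDATE ||
		   action == MSG_DELETE || action == MSG_TRUNCATE;
}
```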
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 92ab95724d..29aea5b56b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4301,6 +4301,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't fetch subskipxid as we don't
+ * include it in the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8587b19160..9cd478025d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6029,7 +6029,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false};
+ false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6063,11 +6063,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Binary"),
gettext_noop("Streaming"));
- /* Two_phase is only supported in v15 and higher */
+ /* Two_phase and skip XID are only supported in v15 and higher */
if (pset.sversion >= 150000)
appendPQExpBuffer(&buf,
- ", subtwophasestate AS \"%s\"\n",
- gettext_noop("Two phase commit"));
+ ", subtwophasestate AS \"%s\"\n"
+ ", subskipxid AS \"%s\"\n",
+ gettext_noop("Two phase commit"),
+ gettext_noop("Skip XID"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6bd33a06cb..b5689ec609 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1710,7 +1710,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1726,6 +1726,12 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("xid");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c291289f..d4410da58f 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 413e7c85a1..ab3554f234 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3716,7 +3716,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..892b6739bc 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,39 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 4294967295
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 0
+(1 row)
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction id: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ERROR: invalid transaction id: 1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+ERROR: invalid transaction id: 2
+-- fail - must be superuser
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+ERROR: must be owner of subscription regress_testsub
+SET SESSION AUTHORIZATION 'regress_subscription_user';
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+----------+--------------------+------------------------------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | off | dbname=regress_doesnotexist2
(1 row)
BEGIN;
@@ -129,10 +157,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+----------+--------------------+------------------------------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | local | dbname=regress_doesnotexist2
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +193,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +216,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication already exists
@@ -215,10 +243,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication used more than once
@@ -233,10 +261,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +298,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | 0 | off | dbname=regress_doesnotexist
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +310,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +322,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..aa15d12d9d 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,23 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+
+-- fail - must be the owner of the subscription
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+SET SESSION AUTHORIZATION 'regress_subscription_user';
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/028_skip_xact.pl b/src/test/subscription/t/028_skip_xact.pl
new file mode 100644
index 0000000000..4c107fc8f5
--- /dev/null
+++ b/src/test/subscription/t/028_skip_xact.pl
@@ -0,0 +1,217 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use Test::More tests => 7;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts on the subscriber. After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# with ALTER SUBSCRIPTION ... SKIP. Then, check that logical replication can
+# continue working by inserting $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname, $nonconflict_data,
+ $expected, $xid, $msg) = @_;
+
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND last_error_xid = '$xid'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]) or die "Timed out while waiting for worker error";
+
+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (xid = '$xid')");
+
+ # Restart the subscriber node to restart logical replication with no interval
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT subskipxid = 0 FROM pg_subscription WHERE subname = '$subname'");
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from the publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (two_phase = on, streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data into test_tab, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab. Then skip the transaction.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", $xid,
+ "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data into test_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", $xid,
+ "test skipping prepare and commit prepared");
+
+# Test for PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(4)", "4", $xid,
+ "test skipping prepare and rollback prepared");
+
+# Test for STREAM COMMIT. Insert enough rows into test_tab_streaming to exceed
+# the 64kB limit, also raising an error on the subscriber while applying the
+# spooled changes for the same reason. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(2, md5(2::text))", "2", $xid,
+ "test skipping stream-commit");
+
+# Test for STREAM PREPARE and COMMIT PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(3, md5(3::text))", "3", $xid,
+ "test skipping stream-prepare and commit prepared");
+
+# Test for STREAM PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(4, md5(4::text))", "4", $xid,
+ "test skipping stream-prepare and rollback prepared");
+
+my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0", "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
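For readers following the thread, the end-to-end recovery flow the patch above enables can be sketched as a short SQL session on the subscriber. This is an illustrative sketch, not output from a real run: the subscription name test_sub and XID 716 are taken from the documentation example in the patch.

```sql
-- On the subscriber: find the failed transaction's XID, either from the
-- CONTEXT line in the server log or from the per-worker stats view.
SELECT subname, last_error_command, last_error_xid, last_error_message
FROM pg_stat_subscription_workers;

-- Tell the apply worker to skip that whole remote transaction.
ALTER SUBSCRIPTION test_sub SKIP (xid = 716);

-- The worker clears the setting after skipping; 0 means nothing to skip.
SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'test_sub';

-- The setting can also be cleared manually before the skip happens.
ALTER SUBSCRIPTION test_sub SKIP (xid = NONE);
```

Note that, per the documentation changes above, skipping applies to the whole transaction, so it can leave the subscriber inconsistent if the wrong XID is specified.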
On Tue, Jan 18, 2022 at 12:20 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Monday, January 17, 2022 9:52 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for the comments!
..
(2) Minor improvement suggestion for a comment in
src/backend/replication/logical/worker.c:

+ * reset during that. Also, we don't skip receiving the changes in streaming
+ * cases, since we decide whether or not to skip applying the changes when

I suggest that you don't use 'streaming cases', because what
"streaming cases" means sounds a bit broader than your actual implementation.
We do skip the transaction in streaming cases, but not during the spooling phase,
right?
I suggest below:
"We don't skip receiving the changes at the phase to spool streaming
transactions"
I might be missing your point, but I think it's correct that we don't skip receiving
the changes of the transaction that is sent via the streaming protocol. And it doesn't
sound broader to me. Could you elaborate on that?

OK. Excuse me for the lack of explanation.
I felt "streaming cases" implies "non-streaming cases"
to compare a difference (in my head) when it is
used to explain something at first.
I imagined the contrast between those when I saw it.

Thus, I thought "streaming cases" meant the
whole flow of streaming transactions, which consists of messages
surrounded by stream start and stream stop and which is finished by
stream commit/stream abort (including 2PC variations).

When I come back to the subject, you wrote below in the comment:

"we don't skip receiving the changes in streaming cases,
since we decide whether or not to skip applying the changes
when starting to apply changes"

The first part of this sentence
("we don't skip receiving the changes in streaming cases")
gives me the impression that we don't skip changes in the streaming cases
(in my understanding above), but the last part
("we decide whether or not to skip applying the changes
when starting to apply changes") means we skip streaming transactions at the apply phase.

So, this sentence looked slightly confusing to me.
Thus, I suggested below (connected with the existing part):

"we don't skip receiving the changes at the phase to spool streaming transactions
since we decide whether or not to skip applying the changes when starting to apply changes"

For me this looked better, but of course, this is a suggestion.
Thank you for your explanation.
I've modified the comment with some changes, since "the phase to spool
streaming transactions" does not seem to be commonly used in worker.c.
(3) In the comment of apply_handle_prepare_internal, two full-width
characters.

3-1
+ * won’t be resent in a case where the server crashes between them.

3-2
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay because this

You have full-width characters for "won't" and "that's".
Could you please check?

Which characters in "won't" are full-width characters? I could not find them.

All the characters I found and mentioned as full-width are single quotes.
It might be good to check the entire patch once
with some tool that helps you detect them.
Thanks!
(5)
I may be missing something here but, in one of the past discussions, there
seemed to be a consensus that if the user specifies the XID of a subtransaction,
it would be better to skip only that subtransaction.

This time, is it out of the range of the patch?
If so, I suggest you include some description of it either in the
commit message or around the code related to it.

How can the user know the subtransaction XID? I suppose you refer to streaming
protocol cases, but while applying spooled changes we don't report the
subtransaction XID, either in the server log or in pg_stat_subscription_workers.

Yeah, usually the subtransaction XID is not exposed to users. I agree.
But clarifying that the target of this feature is only top-level transactions
sounds better to me. Thank you Amit-san for your support
about how we should write it in [1]!

Yes, I've included the sentence proposed by Amit in the latest patch.
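To illustrate the point being settled here, that the feature targets top-level transactions only, a hypothetical session could look like the following. The table and subscription names follow the TAP test above; XID 720 is made up for the example.

```sql
-- Publisher: a transaction containing a subtransaction. Only the
-- top-level transaction ID is returned by pg_current_xact_id() here,
-- and only that XID is meaningful for ALTER SUBSCRIPTION ... SKIP.
BEGIN;
INSERT INTO test_tab VALUES (10);
SAVEPOINT s1;
INSERT INTO test_tab VALUES (1);    -- conflicts on the subscriber
RELEASE SAVEPOINT s1;
SELECT pg_current_xact_id()::xid;   -- top-level XID, e.g. 720
COMMIT;

-- Subscriber: skipping the top-level XID skips the whole transaction,
-- including the changes made under the savepoint.
ALTER SUBSCRIPTION tap_sub SKIP (xid = 720);
```

This matches the behavior described in the commit message: the apply worker skips all data modification changes within the specified transaction, with no way to skip a subtransaction individually.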
(6)
I feel it's a better idea to include a test of whether skipping an aborted
streaming transaction clears the XID in the TAP test for this feature,
in a sense to cover various new code paths. Did you have any special
reason to omit the case?

Which code path is newly covered by this aborted streaming transaction test?
I think that this path is already covered even by the test for a
committed-and-streamed transaction. It doesn't matter whether the streamed
transaction is committed or aborted, because an error occurs while applying the
spooled changes.

Oh, this was my mistake. What I expected as a new code path is
apply_handle_stream_abort -> clear_subscription_skip_xid.
But this was totally wrong, as you explained.

(7)
I want more explanation for the reason to restart the subscriber in
the TAP test, because this is not a mandatory operation.
(We can pass the TAP tests without this restart.)

From:
# Restart the subscriber node to restart logical replication with no interval

IIUC, below would be better.
To:
# As an optimization to finish tests earlier, restart the subscriber with no interval,
# rather than waiting for a new error to launch a new apply worker.
I could not understand why the proposed sentence has more information.
Does it mean you want to mention "As an optimization to finish tests earlier"?

Yes, exactly. The point is to add "As an optimization to finish tests earlier".
Probably, I should have asked the simple question "why do you restart the subscriber?"
At first sight, I couldn't understand the reason for the restart, and
you don't explain the reason itself.

I thought "to restart logical replication with no interval" explains
the reason why we restart the subscriber. I left this part, but we can
change it later if others also want that change.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tuesday, January 18, 2022 1:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. All comments I got so far were incorporated
into this patch unless I'm missing something.
Hi, thank you for your new patch v7.
For your information, I've encountered a failure to apply patch v7
on top of the latest commit (d3f4532)
$ git am v7-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch
Applying: Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on subscriber nodes
error: patch failed: src/backend/parser/gram.y:9954
error: src/backend/parser/gram.y: patch does not apply
Could you please rebase it when it's necessary ?
Best Regards,
Takamichi Osumi
On Tue, Jan 18, 2022 at 2:37 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Tuesday, January 18, 2022 1:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch. All comments I got so far were incorporated
into this patch unless I'm missing something.

Hi, thank you for your new patch v7.
For your information, I've encountered a failure to apply patch v7
on top of the latest commit (d3f4532):

$ git am v7-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch
Applying: Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on subscriber nodes
error: patch failed: src/backend/parser/gram.y:9954
error: src/backend/parser/gram.y: patch does not apply

Could you please rebase it when it's necessary?
Thank you for reporting!
I've attached a rebased patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v8-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patchapplication/octet-stream; name=v8-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patchDownload
From 09fc9a267c457c0daba7fb0317ee521b54b7d86b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v8] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
If an incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX),
updating pg_subscription.subskipxid field, telling the apply worker to
skip the transaction. The apply worker skips all data modification
changes within the specified transaction.
After skipping the transaction the apply worker clears
pg_subscription.subskipxid.
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 49 +++-
doc/src/sgml/ref/alter_subscription.sgml | 42 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/commands/subscriptioncmds.c | 53 +++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 265 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 10 +-
src/bin/psql/tab-complete.c | 8 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 124 ++++++----
src/test/regress/sql/subscription.sql | 17 ++
src/test/subscription/t/028_skip_xact.pl | 217 +++++++++++++++++
15 files changed, 755 insertions(+), 61 deletions(-)
create mode 100644 src/test/subscription/t/028_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2aeb2ef346..16f429b853 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7746,6 +7746,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskipxid</structfield> <type>xid</type>
+ </para>
+ <para>
+ ID of the transaction whose changes are to be skipped, if a valid
+ transaction ID; otherwise 0.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 96b4886e08..de4f83bbcc 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -353,15 +353,58 @@
<para>
The resolution can be done either by changing data or permissions on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ that it does not conflict with the incoming changes or by skipping the
+ transaction that conflicts with the existing data. When a conflict
+ produces an error, it is shown in
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+ </para>
+
+ <programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_xid | 716
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ <para>
+ and it is also shown in the subscriber's server log as follows:
+ </para>
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 at 2021-09-29 15:52:45.165754+00
+</screen>
+
+ <para>
+ The ID of the transaction that contains the change violating the constraint
+ can be found in those outputs (transaction ID 716 in this case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ To resolve a conflict, consider changing the data on the subscriber so that
+ it does not conflict with incoming changes, dropping the conflicting
+ constraint or unique index, writing a trigger on the subscriber to suppress or
+ redirect conflicting incoming changes, or, as a last resort, skipping the
+ whole transaction. Note that skipping a transaction also skips changes that
+ may not violate any constraint, which can easily make the subscriber
+ inconsistent, especially if the wrong transaction ID or origin position is given.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
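The resolution flow described in the documentation changes above can be sketched as a subscriber-side session. This is an illustrative sketch only: the subscription name, XID, origin name, and LSN are hypothetical, not taken from a real run.

```sql
-- Inspect the last error reported for the subscription's apply worker
-- (pg_stat_subscription_workers, as shown in the documentation above).
SELECT last_error_xid, last_error_message
FROM pg_stat_subscription_workers
WHERE subname = 'test_sub';

-- Skip the offending transaction by its remote XID (716 in the example):
ALTER SUBSCRIPTION test_sub SKIP (xid = 716);

-- Alternative, less precise method: advance the replication origin past
-- the transaction. The origin name and target LSN here are hypothetical.
SELECT pg_replication_origin_advance('pg_16391', '0/1634C80'::pg_lsn);
```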
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..8b4568ddab 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -207,6 +208,47 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
+ <listitem>
+ <para>
+       Skips applying all changes of the specified transaction. If incoming data
+       violates any constraint, logical replication stops until the problem is
+       resolved. The resolution can be done either by changing data on the
+       subscriber so that it does not conflict with incoming changes or by
+       skipping the whole transaction. With
+       <command>ALTER SUBSCRIPTION ... SKIP</command>, the logical replication
+       worker skips all data modification changes within the specified
+       transaction, including changes that may not violate any constraint, so it
+       should only be used as a last resort. This option has no effect on
+       transactions that are already prepared with <literal>two_phase</literal>
+       enabled on the subscriber. After the logical replication worker
+       successfully skips the transaction, the transaction ID (stored in
+       <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+       is cleared. See <xref linkend="logical-replication-conflicts"/> for details.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the transaction whose changes are to be skipped
+          by the logical replication worker. Skipping individual
+          subtransactions is not supported. Setting <literal>NONE</literal>
+ resets the transaction ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
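A minimal usage sketch of the SKIP syntax defined above (the subscription name is hypothetical); per the option description, setting `xid = NONE` resets the stored value:

```sql
ALTER SUBSCRIPTION mysub SKIP (xid = 590);      -- set the transaction to skip
SELECT subname, subskipxid
FROM pg_subscription WHERE subname = 'mysub';   -- subskipxid is now 590
ALTER SUBSCRIPTION mysub SKIP (xid = NONE);     -- reset; subskipxid becomes 0
```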
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..da199e9a3e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skipxid = subform->subskipxid;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f5eba450ce..0ff0e00f19 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -61,6 +61,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +83,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId xid; /* InvalidTransactionId for resetting purpose, otherwise
+ * normal transaction id */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +252,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_XID) &&
+ strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (IsSet(opts->specified_opts, SUBOPT_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
+ else
+ {
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction id: %s", xid_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_XID;
+ opts->xid = xid;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +494,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1115,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only xid option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_XID));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ update_tuple = true;
+
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b5966712ce..d1fc1a8b42 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9954,6 +9954,15 @@ AlterSubscriptionStmt:
(Node *)makeBoolean(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c9af775bc1..0264e30112 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -257,6 +258,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the user has specified subskipxid. Once we start
+ * skipping changes, we don't stop until we have skipped all changes of the
+ * transaction, even if pg_subscription is updated and MySubscription->skipxid
+ * gets changed or reset in the meantime. Also, for streamed transactions, we
+ * don't skip receiving and spooling the changes, since we decide whether or
+ * not to skip applying them when we start applying changes. At the end of the
+ * transaction, we disable skipping and reset subskipxid. The timing of
+ * resetting subskipxid differs between the commit and commit/rollback
+ * prepared cases; see the comments in the corresponding functions for details.
+ */
+static TransactionId skip_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skip_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -332,6 +348,13 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+static void clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
@@ -791,6 +814,11 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -843,6 +871,11 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Enable skipping all changes of this transaction if specified.
+ */
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -856,6 +889,36 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
{
char gid[GIDSIZE];
+ /*
+ * If we are skipping all changes of this transaction, we stop doing so
+ * here, but unlike the commit case we do not clear subskipxid in the
+ * pg_subscription catalog; that is done at COMMIT PREPARED or ROLLBACK
+ * PREPARED time instead. If we updated the catalog and then prepared
+ * the transaction, the catalog change would become part of the prepared
+ * transaction. If we did it in the reverse order, subskipxid would be
+ * left uncleared if the server crashed between the two steps, because
+ * the already-prepared transaction would not be resent.
+ *
+ * subskipxid might be changed or cleared by the user before we receive
+ * COMMIT PREPARED or ROLLBACK PREPARED. That is okay because this
+ * prepared transaction is empty.
+ *
+ * One might think that we could skip preparing the skipped transaction
+ * and also skip COMMIT PREPARED or ROLLBACK PREPARED by comparing the
+ * XID received as part of the message with subskipxid. But subskipxid
+ * could be changed by the user between PREPARE and COMMIT PREPARED or
+ * ROLLBACK PREPARED. One idea was to disallow changing subskipxid while
+ * skipping changes, but we don't know when COMMIT PREPARED or ROLLBACK
+ * PREPARED will arrive, and another conflict could occur in the
+ * meantime; if it did, we could not skip that transaction using
+ * subskipxid. Another idea was to check whether the transaction has
+ * been prepared by checking the GID, origin LSN, and origin timestamp
+ * of the prepared transaction, but that doesn't seem worthwhile: it
+ * requires protocol changes, and skipping transactions shouldn't be common.
+ */
+ if (is_skipping_changes())
+ stop_skipping_changes(false, InvalidXLogRecPtr, 0);
+
/*
* Compute unique GID for two_phase transactions. We don't use GID of
* prepared transaction sent by server as that can lead to deadlock when
@@ -901,9 +964,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes were skipped. It
+ * is done this way because, at commit prepared time, we won't know
+ * whether we skipped preparing the transaction due to it having no changes.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -940,6 +1003,23 @@ apply_handle_commit_prepared(StringInfo s)
logicalrep_read_commit_prepared(s, &prepare_data);
set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ if (MySubscription->skipxid == prepare_data.xid)
+ {
+ /*
+ * Clear subskipxid in the pg_subscription catalog. This catalog
+ * update must be committed before finishing the prepared transaction;
+ * otherwise, if the server crashes between finishing the prepared
+ * transaction and the catalog update, COMMIT PREPARED won't be
+ * resent but subskipxid is left set.
+ *
+ * Also, we must not update the replication origin LSN and timestamp
+ * while committing the catalog update so that COMMIT PREPARED will be
+ * resent in case of a crash immediately after the catalog update
+ * commit.
+ */
+ clear_subscription_skip_xid(prepare_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
gid, sizeof(gid));
@@ -981,6 +1061,17 @@ apply_handle_rollback_prepared(StringInfo s)
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ if (MySubscription->skipxid == rollback_data.xid)
+ {
+ /*
+ * Same as COMMIT PREPARED, we must clear subskipxid of
+ * pg_subscription before rolling back the prepared transaction.
+ * Please see the comments in apply_handle_commit_prepared() for
+ * details.
+ */
+ clear_subscription_skip_xid(rollback_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
gid, sizeof(gid));
@@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of a transaction that is
+ * rolled back, but if the skip XID is set, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
* just delete the files with serialized info.
@@ -1331,6 +1429,9 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ /* Enable skipping all changes of this transaction if specified */
+ maybe_start_skipping_changes(xid);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1451,7 +1552,23 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
- if (IsTransactionState())
+ if (is_skipping_changes())
+ {
+ /*
+ * If we are skipping all changes of this transaction, we stop it and
+ * clear subskipxid of pg_subscription.
+ */
+ stop_skipping_changes(true, commit_data->end_lsn,
+ commit_data->committime);
+
+ /* Clearing subskipxid must be committed */
+ Assert(!IsTransactionState());
+
+ pgstat_report_stat(false);
+
+ store_flush_position(commit_data->end_lsn);
+ }
+ else if (IsTransactionState())
{
/*
* Update origin state so we can restart streaming from correct
@@ -2366,6 +2483,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
@@ -3661,6 +3789,135 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by subskipxid.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (MySubscription->skipxid != xid)
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skip_xid = xid;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xid. If clear_subskipxid is true,
+ * also clear subskipxid of pg_subscription by setting InvalidTransactionId.
+ */
+static void
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skip_xid)));
+
+ if (clear_subskipxid)
+ {
+ clear_subscription_skip_xid(skip_xid, origin_lsn, origin_timestamp);
+
+ /* Make sure that clearing subskipxid is committed */
+ if (IsTransactionState())
+ CommitTransactionCommand();
+ }
+
+ /* Stop skipping changes */
+ skip_xid = InvalidTransactionId;
+}
+
+/* Clear subskipxid of pg_subscription catalog */
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool started_tx = false;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskipxid of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Get subskipxid value */
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskipxid of the tuple to InvalidTransactionId. If the
+ * user has already changed subskipxid before we clear it, we don't
+ * update the catalog and don't advance the replication origin state. So
+ * in the worst case, if the server crashes before sending an
+ * acknowledgment of the flush position, the transaction will be sent
+ * again and the user needs to set subskipxid again. We could reduce that
+ * possibility by logging a replication origin WAL record to advance the
+ * origin LSN instead, but there is no way to advance the origin
+ * timestamp, and it doesn't seem worth it for such a minor case.
+ */
+ if (subform->subskipxid == xid)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskipxid */
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_timestamp;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
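After the worker has skipped the specified transaction, clear_subscription_skip_xid() above resets the catalog field, which can be verified from SQL (the subscription name and XID are hypothetical):

```sql
-- The worker logs "done skipping logical replication transaction 590";
-- afterwards the stored XID has been cleared back to 0:
SELECT subskipxid FROM pg_subscription WHERE subname = 'mysub';
```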
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7c2f1d3044..124f755486 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4304,6 +4304,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't fetch subskipxid as we don't
+ * include it in the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 40433e32fa..6be3c5ce18 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6051,7 +6051,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false};
+ false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6085,11 +6085,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Binary"),
gettext_noop("Streaming"));
- /* Two_phase is only supported in v15 and higher */
+ /* Two_phase and skip XID are only supported in v15 and higher */
if (pset.sversion >= 150000)
appendPQExpBuffer(&buf,
- ", subtwophasestate AS \"%s\"\n",
- gettext_noop("Two phase commit"));
+ ", subtwophasestate AS \"%s\"\n"
+ ", subskipxid AS \"%s\"\n",
+ gettext_noop("Two phase commit"),
+ gettext_noop("Skip XID"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6bd33a06cb..b5689ec609 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1710,7 +1710,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1726,6 +1726,12 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("xid");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c291289f..d4410da58f 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3e9bdc781f..c7f9d12ac6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3717,7 +3717,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..892b6739bc 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,39 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 4294967295
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 0
+(1 row)
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction id: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ERROR: invalid transaction id: 1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+ERROR: invalid transaction id: 2
+-- fail - must be superuser
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+ERROR: must be owner of subscription regress_testsub
+SET SESSION AUTHORIZATION 'regress_subscription_user';
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+----------+--------------------+------------------------------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | off | dbname=regress_doesnotexist2
(1 row)
BEGIN;
@@ -129,10 +157,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+----------+--------------------+------------------------------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | local | dbname=regress_doesnotexist2
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +193,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +216,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication already exists
@@ -215,10 +243,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication used more then once
@@ -233,10 +261,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +298,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | 0 | off | dbname=regress_doesnotexist
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +310,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +322,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Skip XID | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+----------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..aa15d12d9d 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,23 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+
+-- fail - must be superuser
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+SET SESSION AUTHORIZATION 'regress_subscription_user';
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/028_skip_xact.pl b/src/test/subscription/t/028_skip_xact.pl
new file mode 100644
index 0000000000..4c107fc8f5
--- /dev/null
+++ b/src/test/subscription/t/028_skip_xact.pl
@@ -0,0 +1,217 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use Test::More tests => 7;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts on the subscriber. After waiting until the
+# subscription worker stats are updated, we skip the transaction in question
+# with ALTER SUBSCRIPTION ... SKIP. Then, we check whether logical replication
+# can continue working by inserting $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname, $nonconflict_data,
+ $expected, $xid, $msg) = @_;
+
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND last_error_xid = '$xid'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]) or die "Timed out while waiting for worker error";
+
+ # Set skip xid
+ $node_subscriber->safe_psql(
+ 'postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (xid = '$xid')");
+
+ # Restart the subscriber node to restart logical replication with no interval
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ "SELECT subskipxid = 0 FROM pg_subscription WHERE subname = '$subname'");
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql(
+ 'postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf('postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On subscriber we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (two_phase = on, streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data into test_tab, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab. Then skip the transaction.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", $xid,
+ "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data into test_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", $xid,
+ "test skipping prepare and commit prepared");
+
+# Test for PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(4)", "4", $xid,
+ "test skipping prepare and rollback prepared");
+
+# Test for STREAM COMMIT. Insert enough rows to test_tab_streaming to exceed the 64kB
+# limit, also raising an error on the subscriber during applying spooled changes for the
+# same reason. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(2, md5(2::text))", "2", $xid,
+ "test skipping stream-commit");
+
+# Test for STREAM PREPARE and COMMIT PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(3, md5(3::text))", "3", $xid,
+ "test skipping stream-prepare and commit prepared");
+
+# Test for STREAM PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming", "test_tab_streaming",
+ "(4, md5(4::text))", "4", $xid,
+ "test skipping stream-prepare and rollback prepared");
+
+my $res = $node_subscriber->safe_psql(
+ 'postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0", "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
On Tuesday, January 18, 2022 3:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached a rebased patch.
Thank you for your rebase!
Several review comments on v8.
(1) doc/src/sgml/logical-replication.sgml
+
+ <para>
+ To resolve conflicts, you need to consider changing the data on the subscriber so
+ that it doesn't conflict with incoming changes, or dropping the conflicting constraint
+ or unique index, or writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or as a last resort, by skipping the whole transaction.
+ Skipping the whole transaction includes skipping changes that may not violate
+ any constraint. This can easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
+ </para>
The first sentence is too long and slightly lacks readability.
One idea to sort out the listed items is to utilize "itemizedlist".
For instance, I imagined something like below.
<para>
To resolve conflicts, you need to consider following actions:
<itemizedlist>
<listitem>
<para>
Change the data on the subscriber so that it doesn't conflict with incoming changes
</para>
</listitem>
...
<listitem>
<para>
As a last resort, skip the whole transaction
</para>
</listitem>
</itemizedlist>
....
</para>
What do you think?
By the way, only in case you want to keep the current sentence style,
I have one more question. Do we need "by" in the phrase
"by skipping the whole transaction"? If we focus on only this action,
I think the sentence becomes "you need to consider skipping the whole transaction".
If so, we don't need "by" there.
(2)
Also, in the same paragraph, we write
+ ... This can easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
Shouldn't the subject of this sentence be "Those" or "Some of those",
because we want to mention either the new skip xid feature or
"pg_replication_origin_advance"?
(3) doc/src/sgml/ref/alter_subscription.sgml
Below change contains unnecessary spaces.
+ the whole transaction. Using <command> ALTER SUBSCRIPTION ... SKIP </command>
Need to change
From:
<command> ALTER SUBSCRIPTION ... SKIP </command>
To:
<command>ALTER SUBSCRIPTION ... SKIP</command>
(4) comment in clear_subscription_skip_xid
+ * the flush position the transaction will be sent again and the user
+ * needs to be set subskipxid again. We can reduce the possibility by
Should change
From:
the user needs to be set...
To:
the user needs to set...
(5) clear_subscription_skip_xid
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
Can we change it to ereport with ERRCODE_UNDEFINED_OBJECT ?
This suggestion has the additional benefit that, within one patch,
we don't mix ereport and elog at the same time.
(6) apply_handle_stream_abort
@@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of the transaction that is
+ * rolled back but if the skip XID is set, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
In my humble opinion, this still cares about the subtransaction XID.
If we want to be consistent with top level transactions only,
I felt checking MySubscription->skipxid == xid should be sufficient.
Below is an *insane* (in the sense of incorrect usage) scenario
that hits the "MySubscription->skipxid == subxid" condition.
Sorry if it is not perfect.
-------
Set logical_decoding_work_mem = 64.
Create tables named 'tab' with a column id (integer);
Create pub and sub with streaming = true.
No initial data is required on either node
because we just want to issue stream_abort
after using the skip xid feature.
<Session1> to the publisher
begin;
select pg_current_xact_id(); -- for reference
insert into tab values (1);
savepoint s1;
insert into tab values (2);
savepoint s2;
insert into tab values (generate_series(1001, 2000));
select ctid, xmin, xmax, id from tab where id in (1, 2, 1001);
<Session2> to the subscriber
select subname, subskipxid from pg_subscription; -- shows 0
alter subscription mysub skip (xid = xxx); -- xxx is that of xmin for 1001 on the publisher
select subname, subskipxid from pg_subscription; -- check it shows xxx just in case
<Session1>
rollback to s1;
commit;
select * from tab; -- shows only data '1'.
<Session2>
select subname, subskipxid from pg_subscription; -- shows 0. subskipxid was reset by the skip xid feature
select count(1) = 1 from tab; -- shows true
FYI: the commands result of those last two commands.
postgres=# select subname, subskipxid from pg_subscription;
subname | subskipxid
---------+------------
mysub | 0
(1 row)
postgres=# select count(1) = 1 from tab;
?column?
----------
t
(1 row)
Thus, it still cares about subtransactions and clears subskipxid.
Should we fix this behavior for consistency?
Best Regards,
Takamichi Osumi
On Sat, Jan 15, 2022 at 3:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 14, 2022 at 5:35 PM vignesh C <vignesh21@gmail.com> wrote:
Thanks for the updated patch, few minor comments:
1) Should "SKIP" be "SKIP (" here:
@@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
 /* ALTER SUBSCRIPTION <name> */
 else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
 COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
Won't the other rule added by the patch, as follows, be sufficient for what you are asking?
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");
I might be missing something, but why should the handling of SKIP
be any different from what we are doing for SET?
In case of "ALTER SUBSCRIPTION sub1 SET" there are 2 possible tab
completion options, user can either specify "ALTER SUBSCRIPTION sub1
SET PUBLICATION pub1" or "ALTER SUBSCRIPTION sub1 SET ( SET option
like STREAMING,etc = 'on')", that is why we have 2 possible options as
below:
postgres=# ALTER SUBSCRIPTION sub1 SET
( PUBLICATION
Whereas in the case of SKIP there is only one possible tab completion
option, i.e., XID. We handle the WITH option similarly: we specify
"WITH (" in the tab completion for "CREATE PUBLICATION pub1":
postgres=# CREATE PUBLICATION pub1
FOR ALL TABLES FOR ALL TABLES IN SCHEMA FOR TABLE
WITH (
Regards,
Vignesh
On Tue, Jan 18, 2022 at 5:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached a rebased patch.
A couple of comments for the v8 patch:
doc/src/sgml/logical-replication.sgml
(1)
Strictly speaking, it's the transaction, not the transaction ID, that
contains the changes, so I suggest a minor change:
BEFORE:
+ The transaction ID that contains the change violating the constraint can be
AFTER:
+ The ID of the transaction that contains the change violating the
constraint can be
doc/src/sgml/ref/alter_subscription.sgml
(2) apply_handle_commit_internal
It's not entirely apparent what commits the clearing of subskipxid
here, so I suggest the following addition:
BEFORE:
+ * clear subskipxid of pg_subscription.
AFTER:
+ * clear subskipxid of pg_subscription, then commit.
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Jan 19, 2022 at 12:22 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Tuesday, January 18, 2022 3:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached a rebased patch.
Thank you for your rebase!
Several review comments on v8.
Thank you for the comments!
(1) doc/src/sgml/logical-replication.sgml
+
+ <para>
+ To resolve conflicts, you need to consider changing the data on the subscriber so
+ that it doesn't conflict with incoming changes, or dropping the conflicting constraint
+ or unique index, or writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or as a last resort, by skipping the whole transaction.
+ Skipping the whole transaction includes skipping changes that may not violate
+ any constraint. This can easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
+ </para>
The first sentence is too long and slightly lacks readability.
One idea to sort out the listed items is to utilize "itemizedlist".
For instance, I imagined something like below.
<para>
To resolve conflicts, you need to consider following actions:
<itemizedlist>
<listitem>
<para>
Change the data on the subscriber so that it doesn't conflict with incoming changes
</para>
</listitem>
...
<listitem>
<para>
As a last resort, skip the whole transaction
</para>
</listitem>
</itemizedlist>
....
</para>
What do you think?
By the way, in case only when you want to keep the current sentence style,
I have one more question. Do we need "by" in the part
"by skipping the whole transaction" ? If we focus on only this action,
I think the sentence becomes "you need to consider skipping the whole transaction".
If this is true, we don't need "by" in the part.
I personally prefer to keep the current sentence, since listing the items
seems unsuitable in this case. But I agree that "by" is not
necessary here.
(2)
Also, in the same paragraph, we write
+ ... This can easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the position of origin.
Shouldn't the subject of this sentence be "Those" or "Some of those",
because we want to mention either the new skip xid feature or
"pg_replication_origin_advance"?
I think "This" in the sentence refers to "Skipping the whole
transaction". In the previous paragraph, we describe that there are
two methods for skipping the whole transaction: this new feature and
pg_replication_origin_advance(). And in this paragraph, we don't
mention any specific methods for skipping the whole transaction but
describe that skipping the whole transaction per se can easily make
the subscriber inconsistent. The current structure is fine with me.
(3) doc/src/sgml/ref/alter_subscription.sgml
Below change contains unnecessary spaces.
+ the whole transaction. Using <command> ALTER SUBSCRIPTION ... SKIP </command>
Need to change
From:
<command> ALTER SUBSCRIPTION ... SKIP </command>
To:
<command>ALTER SUBSCRIPTION ... SKIP</command>
Will remove.
(4) comment in clear_subscription_skip_xid
+ * the flush position the transaction will be sent again and the user
+ * needs to be set subskipxid again. We can reduce the possibility by
Should change
From:
the user needs to be set...
To:
the user needs to set...
Will fix.
(5) clear_subscription_skip_xid
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
Can we change it to ereport with ERRCODE_UNDEFINED_OBJECT?
This suggestion has the additional benefit that, within one patch,
we don't mix ereport and elog at the same time.
I don’t think we need to set errcode since this error is a
should-not-happen error.
(6) apply_handle_stream_abort
@@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of the transaction that is
+ * rolled back but if the skip XID is set, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
In my humble opinion, this still cares about the subtransaction XID.
If we want to be consistent with top-level transactions only,
I felt checking MySubscription->skipxid == xid should be sufficient.
I thought that if we can clear a subskipxid value that has already been
processed on the subscriber at a reasonable cost, it makes sense to
do so, because it can reduce the possibility of the XID wrapping
around while a wrong value is left in subskipxid. But as you pointed
out, the current behavior doesn’t match the description in the doc:
After the logical replication successfully skips the transaction, the
transaction ID (stored in pg_subscription.subskipxid) is cleared.
and
We don't support skipping individual subtransactions.
I'll remove it in the next version patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
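As a recap, the intended end-to-end workflow under discussion looks roughly like the session below. This is a sketch pieced together from the patch's regression tests and the error example earlier in the thread; the subscription name mysub and the XID 590 are placeholders, and the view and column names (pg_stat_subscription_workers.last_error_xid, pg_subscription.subskipxid) are those used by the patch as of v8 and may change in later versions.

```sql
-- On the subscriber: find the XID of the remote transaction that failed.
SELECT subname, last_error_xid, last_error_message
FROM pg_stat_subscription_workers;

-- Tell the apply worker to skip exactly that transaction.
ALTER SUBSCRIPTION mysub SKIP (xid = 590);

-- Once the worker has skipped the transaction, subskipxid is reset to 0.
SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'mysub';

-- The setting can also be cleared manually before the transaction arrives.
ALTER SUBSCRIPTION mysub SKIP (xid = NONE);
```

This mirrors the flow exercised by test_skip_xact() in 028_skip_xact.pl: wait for the worker error, set the skip XID, then verify that subskipxid returns to 0 after the transaction is skipped.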
On Wed, Jan 19, 2022 at 12:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 19, 2022 at 12:22 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:(6) apply_handle_stream_abort
@@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the XID of the transaction that is
+ * rolled back but if the skip XID is set, clear it.
+ */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
In my humble opinion, this still cares about the subtransaction XID.
If we want to be consistent with top-level transactions only,
I felt checking MySubscription->skipxid == xid should be sufficient.
I thought that if we can clear a subskipxid value that has already been
processed on the subscriber at a reasonable cost, it makes sense to
do so, because it can reduce the possibility of the XID wrapping
around while a wrong value is left in subskipxid.
I guess that could happen if the user sets some unrelated XID value.
So, I think it should be okay to not clear this but we can add a
comment in the code at that place that we don't clear subtransaction's
XID as we don't support skipping individual subtransactions or
something like that.
--
With Regards,
Amit Kapila.
On 18.01.22 07:05, Masahiko Sawada wrote:
I've attached a rebased patch.
I think this is now almost done. Attached I have a small fixup patch
with some documentation proof-reading, and removing some comments I felt
were redundant. Some others have also sent you some documentation
updates, so feel free to merge mine in with them.
Some other comments:
parse_subscription_options() and AlterSubscriptionStmt mix regular
options and skip options in ways that confuse me. It seems to work
correctly, though. I guess for now it's okay, but if we add more skip
options, it might be better to separate those more cleanly.
I think the superuser check in AlterSubscription() might no longer be
appropriate. Subscriptions can now be owned by non-superusers. Please
check that.
The display order in psql \dRs+ is a bit odd. I would put it at the
end, certainly not between Two phase commit and Synchronous commit.
Please run pgperltidy over 028_skip_xact.pl.
Is the setting of logical_decoding_work_mem in the test script required?
If so, comment why.
Please document arguments origin_lsn and origin_timestamp of
stop_skipping_changes(). Otherwise, one has to dig quite deep to find
out what they are for.
This is all minor stuff, so I think when this and the nearby comments
are addressed, this is fine by me.
Attachments:
0001-fixup-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-tran.patch (text/plain)
From 751be8216317f1d996c7b4f9f0e915adff805567 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 20 Jan 2022 17:03:27 +0100
Subject: [PATCH] fixup! Add ALTER SUBSCRIPTION ... SKIP to skip the
transaction on subscriber nodes
---
doc/src/sgml/logical-replication.sgml | 12 ++++--------
doc/src/sgml/ref/alter_subscription.sgml | 16 ++++++++--------
src/backend/commands/subscriptioncmds.c | 2 +-
src/backend/replication/logical/worker.c | 7 -------
src/test/regress/expected/subscription.out | 6 +++---
src/test/subscription/t/028_skip_xact.pl | 8 +++++---
6 files changed, 21 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index de4f83bbcc..873530db99 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -355,11 +355,10 @@ <title>Conflicts</title>
The resolution can be done either by changing data or permissions on the subscriber so
that it does not conflict with the incoming changes or by skipping the
the transaction that conflicts with the existing data. When a conflict
- produces an error, it is shown in
+ produces an error, it is shown in the
<structname>pg_stat_subscription_workers</structname> view as follows:
- </para>
- <programlisting>
+<programlisting>
postgres=# SELECT * FROM pg_stat_subscription_workers;
-[ RECORD 1 ]------+-----------------------------------------------------------
subid | 16391
@@ -373,9 +372,7 @@ <title>Conflicts</title>
last_error_time | 2021-09-29 15:52:45.165754+00
</programlisting>
- <para>
and it is also shown in subscriber's server log as follows:
- </para>
<screen>
ERROR: duplicate key value violates unique constraint "test_pkey"
@@ -383,7 +380,6 @@ <title>Conflicts</title>
CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 at 2021-09-29 15:52:45.165754+00
</screen>
- <para>
The transaction ID that contains the change violating the constraint can be
found from those outputs (transaction ID 716 in the above case). The transaction
can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
@@ -401,9 +397,9 @@ <title>Conflicts</title>
that it doesn't conflict with incoming changes, or dropping the conflicting constraint
or unique index, or writing a trigger on the subscriber to suppress or redirect
conflicting incoming changes, or as a last resort, by skipping the whole transaction.
- Skipping the whole transaction includes skipping changes that may not violate
+ Skipping the whole transaction includes skipping changes that might not violate
any constraint. This can easily make the subscriber inconsistent, especially if
- a user specifies the wrong transaction ID or the position of origin.
+ a user specifies the wrong transaction ID or the wrong position of origin.
</para>
</sect1>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 8b4568ddab..7e0eb55653 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -212,16 +212,16 @@ <title>Parameters</title>
<term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
<listitem>
<para>
- Skips applying all changes of the specified transaction. If incoming data
- violates any constraints, the logical replication will stop until it is
+ Skips applying all changes of the specified remote transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
resolved. The resolution can be done either by changing data on the
subscriber so that it doesn't conflict with incoming changes or by skipping
- the whole transaction. Using <command> ALTER SUBSCRIPTION ... SKIP </command>
+ the whole transaction. Using the <command>ALTER SUBSCRIPTION ... SKIP</command>
command, the logical replication worker skips all data modification changes
- within the specified transaction including changes that may not violate
+ within the specified transaction, including changes that might not violate
the constraint, so, it should only be used as a last resort. This option has
no effect on the transactions that are already prepared by enabling
- <literal>two_phase</literal> on subscriber. After the logical replication
+ <literal>two_phase</literal> on subscriber. After logical replication
successfully skips the transaction, the transaction ID (stored in
<structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
is cleared. See <xref linkend="logical-replication-conflicts"/> for
@@ -237,9 +237,9 @@ <title>Parameters</title>
<term><literal>xid</literal> (<type>xid</type>)</term>
<listitem>
<para>
- Specifies the ID of the transaction whose changes are to be skipped
- by the logical replication worker. We don't support skipping
- individual subtransactions. Setting <literal>NONE</literal>
+ Specifies the ID of the remote transaction whose changes are to be skipped
+ by the logical replication worker. Skipping
+ individual subtransactions is not supported. Setting <literal>NONE</literal>
resets the transaction ID.
</para>
</listitem>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 0ff0e00f19..b8fb7130a6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -273,7 +273,7 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
if (!TransactionIdIsNormal(xid))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("invalid transaction id: %s", xid_str)));
+ errmsg("invalid transaction ID: %s", xid_str)));
}
opts->specified_opts |= SUBOPT_XID;
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 0264e30112..0b6d9a203a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -814,9 +814,6 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
- /*
- * Enable skipping all changes of this transaction if specified.
- */
maybe_start_skipping_changes(begin_data.xid);
in_remote_transaction = true;
@@ -871,9 +868,6 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
- /*
- * Enable skipping all changes of this transaction if specified
- */
maybe_start_skipping_changes(begin_data.xid);
in_remote_transaction = true;
@@ -1429,7 +1423,6 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
- /* Enable skipping all changes of this transaction if specified */
maybe_start_skipping_changes(xid);
/*
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 892b6739bc..82733eed98 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -111,11 +111,11 @@ SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub
-- fail
ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
-ERROR: invalid transaction id: 0
+ERROR: invalid transaction ID: 0
ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
-ERROR: invalid transaction id: 1
+ERROR: invalid transaction ID: 1
ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
-ERROR: invalid transaction id: 2
+ERROR: invalid transaction ID: 2
-- fail - must be superuser
SET SESSION AUTHORIZATION 'regress_subscription_user2';
ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
diff --git a/src/test/subscription/t/028_skip_xact.pl b/src/test/subscription/t/028_skip_xact.pl
index 4c107fc8f5..efe28c71af 100644
--- a/src/test/subscription/t/028_skip_xact.pl
+++ b/src/test/subscription/t/028_skip_xact.pl
@@ -8,8 +8,8 @@
use Test::More tests => 7;
# Test skipping the transaction. This function must be called after the caller
-# inserting data that conflict with the subscriber. After waiting for the
-# subscription worker stats are updated, we skip the transaction in question
+# has inserted data that conflicts with the subscriber. After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication can continue
# working by inserting $nonconflict_data on the publisher.
sub test_skip_xact
@@ -17,6 +17,8 @@ sub test_skip_xact
my ($node_publisher, $node_subscriber, $subname, $relname, $nonconflict_data,
$expected, $xid, $msg) = @_;
+ local $Test::Builder::Level = $Test::Builder::Level + 1;
+
# Wait for worker error
$node_subscriber->poll_query_until(
'postgres',
@@ -83,7 +85,7 @@ sub test_skip_xact
]);
$node_subscriber->start;
-# Initial table setup on both publisher and subscriber. On subscriber we
+# Initial table setup on both publisher and subscriber. On the subscriber, we
# create the same tables but with primary keys. Also, insert some data that
# will conflict with the data replicated from publisher later.
$node_publisher->safe_psql(
--
2.34.1
On Fri, Jan 21, 2022 at 1:18 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 18.01.22 07:05, Masahiko Sawada wrote:
I've attached a rebased patch.
I think this is now almost done. Attached I have a small fixup patch
with some documentation proof-reading, and removing some comments I felt
are redundant. Some others have also sent you some documentation
updates, so feel free to merge mine in with them.
Thank you for reviewing the patch and attaching the fixup patch!
Some other comments:
parse_subscription_options() and AlterSubscriptionStmt mixes regular
options and skip options in ways that confuse me. It seems to work
correctly, though. I guess for now it's okay, but if we add more skip
options, it might be better to separate those more cleanly.
Agreed.
I think the superuser check in AlterSubscription() might no longer be
appropriate. Subscriptions can now be owned by non-superusers. Please
check that.
IIUC we don't allow non-superusers to own subscriptions yet. We
still have the following superuser checks:
In CreateSubscription():
if (!superuser())
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser to create subscriptions")));
and in AlterSubscriptionOwner_internal():
/* New owner must be a superuser */
if (!superuser_arg(newOwnerId))
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("permission denied to change owner of
subscription \"%s\"",
NameStr(form->subname)),
errhint("The owner of a subscription must be a superuser.")));
Also, doing a superuser check here seems consistent with
pg_replication_origin_advance(), which is another way to skip
transactions and also requires superuser permission.
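For context, the existing superuser-only alternative mentioned here works by advancing the replication origin past the problem transaction. A rough sketch follows; the origin name pg_16391 is derived from the subscription OID shown in the stats-view example earlier in the thread, and the LSN is a placeholder — both are assumptions for illustration.

```sql
-- Find the origin corresponding to the subscription and its position.
-- Subscription origins are named pg_<subscription oid>.
SELECT external_id, remote_lsn
FROM pg_replication_origin_status;

-- Advance past the end of the problem transaction (LSN is a placeholder).
SELECT pg_replication_origin_advance('pg_16391', '0/1A2B3C4'::pg_lsn);
```

Unlike ALTER SUBSCRIPTION ... SKIP, advancing the origin by LSN can overshoot and skip more than one transaction, which is the motivation for the patch in this thread.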
The display order in psql \dRs+ is a bit odd. I would put it at the
end, certainly not between Two phase commit and Synchronous commit.
Fixed.
Please run pgperltidy over 028_skip_xact.pl.
Fixed.
Is the setting of logical_decoding_work_mem in the test script required?
If so, comment why.
Yes, it lets the tests check streaming logical replication cases
easily. Added a comment.
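As a sketch of why the test sets this GUC (the value shown is illustrative, not necessarily what the test uses): lowering logical_decoding_work_mem forces the publisher to stream or spill even small transactions before commit, so the streaming apply paths are exercised without generating huge amounts of test data.

```
# Publisher configuration (postgresql.conf or node setup in the TAP test):
logical_decoding_work_mem = 64kB   # minimum value; forces early streaming
```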
Please document arguments origin_lsn and origin_timestamp of
stop_skipping_changes(). Otherwise, one has to dig quite deep to find
out what they are for.
Added.
Also, after reading the documentation updates, I realized that there
are two paragraphs describing almost the same thing, so I merged them.
Please check the doc updates in the latest patch.
I've attached an updated patch that incorporates these comments as
well as other feedback I got so far.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v9-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch (application/x-patch)
From 8edaa5fa88fe3ece0ca22c2ff5ccc0daa1d4029f Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v9] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX),
updating pg_subscription.subskipxid field, telling the apply worker to
skip the transaction. The apply worker skips all data modification
changes within the specified transaction.
After skipping the transaction the apply worker clears
pg_subscription.subskipxid.
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 44 +++-
doc/src/sgml/ref/alter_subscription.sgml | 42 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/commands/subscriptioncmds.c | 53 +++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 263 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 5 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 124 ++++++----
src/test/regress/sql/subscription.sql | 17 ++
src/test/subscription/t/028_skip_xact.pl | 226 ++++++++++++++++++
15 files changed, 754 insertions(+), 59 deletions(-)
create mode 100644 src/test/subscription/t/028_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 1e65c426b2..de3581b92d 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7736,6 +7736,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskipxid</structfield> <type>xid</type>
+ </para>
+ <para>
+ ID of the transaction whose changes are to be skipped, if a valid
+ transaction ID; otherwise 0.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 96b4886e08..a89e545b8c 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -352,16 +352,52 @@
</para>
<para>
- The resolution can be done either by changing data or permissions on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ When a conflict produces an error, it is shown in the
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+
+<programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_xid | 716
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ and it is also shown in subscriber's server log as follows:
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 at 2021-09-29 15:52:45.165754+00
+</screen>
+
+ The ID of the transaction that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ The resolution can be done by changing data or permissions on the subscriber so
+ that it does not conflict with incoming changes, by dropping the conflicting constraint
+ or unique index, or by writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or as a last resort, by skipping the whole transaction.
+ Skipping the whole transaction includes skipping changes that might not violate
+ any constraint. This can easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the wrong position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..7e0eb55653 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -207,6 +208,47 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the specified remote transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming changes or by skipping
+ the whole transaction. Using the <command>ALTER SUBSCRIPTION ... SKIP</command>
+ command, the logical replication worker skips all data modification changes
+ within the specified transaction, including changes that might not violate
+ the constraint, so, it should only be used as a last resort. This option has
+ no effect on the transactions that are already prepared by enabling
+ <literal>two_phase</literal> on subscriber. After logical replication
+ successfully skips the transaction, the transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the remote transaction whose changes are to be skipped
+ by the logical replication worker. Skipping
+ individual subtransactions is not supported. Setting <literal>NONE</literal>
+ resets the transaction ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..da199e9a3e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skipxid = subform->subskipxid;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f5eba450ce..b8fb7130a6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -61,6 +61,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +83,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId xid; /* InvalidTransactionId for resetting purpose, otherwise
+ * normal transaction id */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +252,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_XID) &&
+ strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (IsSet(opts->specified_opts, SUBOPT_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
+ else
+ {
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction ID: %s", xid_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_XID;
+ opts->xid = xid;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +494,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1115,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only xid option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_XID));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ update_tuple = true;
+
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b5966712ce..d1fc1a8b42 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9954,6 +9954,15 @@ AlterSubscriptionStmt:
(Node *)makeBoolean(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c9af775bc1..245d7b67fd 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -257,6 +258,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the user has specified subskipxid. Once we start skipping
+ * changes, we don't stop it until we skip all changes of the transaction even
+ * if pg_subscription is updated and MySubscription->skipxid gets changed or
+ * reset during that. Also, in streaming transaction cases, we don't skip
+ * receiving and spooling the changes, since we decide whether or not to skip
+ * applying the changes when starting to apply changes. At end of the transaction,
+ * we disable it and reset subskipxid. The timing of resetting subskipxid varies
+ * depending on commit or commit/rollback prepared case. Please refer to the
+ * comments in corresponding functions for details.
+ */
+static TransactionId skip_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skip_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -332,6 +348,13 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(bool commit, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+static void clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
@@ -791,6 +814,8 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -843,6 +868,8 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -856,6 +883,36 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
{
char gid[GIDSIZE];
+ /*
+ * If we are skipping all changes of this transaction, we stop it but
+ * unlike commit, we do not clear subskipxid of pg_subscription catalog
+ * here and will do that at commit prepared or rollback prepared time. If
+ * we update the catalog and then prepare the transaction, the changes
+ * will be part of the prepared transaction. Even if we do that in
+ * reverse order, subskipxid will not be cleared but this transaction
+ * won't be resent in a case where the server crashes between them.
+ *
+ * subskipxid might be changed or cleared by the user before we receive
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that's okay because this
+ * prepared transaction is empty.
+ *
+ * One might think that we can skip preparing the skipped transaction and
+ * also skip COMMIT PREPARED or ROLLBACK PREPARED by comparing the XID
+ * received as part of the message to subskipxid. But subskipxid could be
+ * changed by users between PREPARE and COMMIT PREPARED or ROLLBACK
+ * PREPARED. There was an idea to disallow users to change subskipxid
+ * while skipping changes. But we don't know when COMMIT PREPARED or
+ * ROLLBACK PREPARED comes and another conflict could occur in the
+ * meanwhile. If such another conflict occurs, we cannot skip the
+ * transaction by using subskipxid. Also, there was another idea to check
+ * whether the transaction has been prepared or not by checking GID,
+ * origin LSN, and origin timestamp of the prepared transaction but that
+ * doesn't seem worthwhile because it requires protocol changes, and
+ * skipping transactions shouldn't be common.
+ */
+ if (is_skipping_changes())
+ stop_skipping_changes(false, InvalidXLogRecPtr, 0);
+
/*
* Compute unique GID for two_phase transactions. We don't use GID of
* prepared transaction sent by server as that can lead to deadlock when
@@ -901,9 +958,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -940,6 +997,23 @@ apply_handle_commit_prepared(StringInfo s)
logicalrep_read_commit_prepared(s, &prepare_data);
set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ if (MySubscription->skipxid == prepare_data.xid)
+ {
+ /*
+ * Clear the subskipxid of pg_subscription catalog. This catalog
+ * update must be committed before finishing prepared transaction.
+ * Because otherwise, in a case where the server crashes between
+ * finishing prepared transaction and the catalog update, COMMIT
+ * PREPARED won't be resent but subskipxid is left.
+ *
+ * Also, we must not update the replication origin LSN and timestamp
+ * while committing the catalog update so that COMMIT PREPARED will be
+ * resent in case of a crash immediately after the catalog update
+ * commit.
+ */
+ clear_subscription_skip_xid(prepare_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
gid, sizeof(gid));
@@ -981,6 +1055,17 @@ apply_handle_rollback_prepared(StringInfo s)
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ if (MySubscription->skipxid == rollback_data.xid)
+ {
+ /*
+ * Same as COMMIT PREPARED, we must clear subskipxid of
+ * pg_subscription before rolling back the prepared transaction.
+ * Please see the comments in apply_handle_commit_prepared() for
+ * details.
+ */
+ clear_subscription_skip_xid(rollback_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
gid, sizeof(gid));
@@ -1209,6 +1294,15 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+ * We don't expect the user to set the skip XID to a transaction that
+ * gets rolled back, but if it is set, clear it. Since we don't
+ * support skipping individual subtransactions, we don't clear the
+ * subtransaction's XID.
+ */
+ if (MySubscription->skipxid == xid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
/*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
* just delete the files with serialized info.
@@ -1331,6 +1425,8 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ maybe_start_skipping_changes(xid);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1451,7 +1547,23 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
- if (IsTransactionState())
+ if (is_skipping_changes())
+ {
+ /*
+ * If we are skipping all changes of this transaction, stop skipping
+ * and clear subskipxid of pg_subscription.
+ */
+ stop_skipping_changes(true, commit_data->end_lsn,
+ commit_data->committime);
+
+ /* Clearing subskipxid must be committed */
+ Assert(!IsTransactionState());
+
+ pgstat_report_stat(false);
+
+ store_flush_position(commit_data->end_lsn);
+ }
+ else if (IsTransactionState())
{
/*
* Update origin state so we can restart streaming from correct
@@ -2366,6 +2478,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recursively when applying spooled changes, save the current
@@ -3661,6 +3784,138 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by subskipxid.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (MySubscription->skipxid != xid)
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skip_xid = xid;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xid. If clear_subskipxid is true,
+ * we also clear subskipxid of pg_subscription by setting it to
+ * InvalidTransactionId. origin_lsn and origin_timestamp are used to update
+ * the origin state when clearing subskipxid so that we can restart streaming
+ * from the correct position in case of a crash.
+ */
+static void
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skip_xid)));
+
+ if (clear_subskipxid)
+ {
+ clear_subscription_skip_xid(skip_xid, origin_lsn, origin_timestamp);
+
+ /* Make sure that clearing subskipxid is committed */
+ if (IsTransactionState())
+ CommitTransactionCommand();
+ }
+
+ /* Stop skipping changes */
+ skip_xid = InvalidTransactionId;
+}
+
+/* Clear subskipxid of pg_subscription catalog */
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool started_tx = false;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskipxid of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Get subskipxid value */
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskipxid of the tuple to InvalidTransactionId. If the
+ * user has already changed subskipxid before we clear it, we don't
+ * update the catalog and don't advance the replication origin state.
+ * So in the worst case, if the server crashes before sending an
+ * acknowledgment of the flush position, the transaction will be sent
+ * again and the user will need to set subskipxid again. We could
+ * reduce this possibility by logging a replication origin WAL record
+ * to advance the origin LSN instead, but there is no way to advance
+ * the origin timestamp and it doesn't seem worth it since this is a
+ * very minor case.
+ */
+ if (subform->subskipxid == xid)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskipxid */
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_timestamp;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7c2f1d3044..124f755486 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4304,6 +4304,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't fetch subskipxid as we don't
+ * include it in the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 40433e32fa..fa247590b0 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6051,7 +6051,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false};
+ false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6096,6 +6096,12 @@ describeSubscriptions(const char *pattern, bool verbose)
", subconninfo AS \"%s\"\n",
gettext_noop("Synchronous commit"),
gettext_noop("Conninfo"));
+
+ /* Skip XID is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskipxid AS \"%s\"\n",
+ gettext_noop("Skip XID"));
}
/* Only display subscriptions in current database. */
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6bd33a06cb..e402c26aa8 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1710,7 +1710,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP (",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1726,6 +1726,9 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("xid");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c291289f..d4410da58f 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid; /* All changes associated with
+ * this XID are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3e9bdc781f..c7f9d12ac6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3717,7 +3717,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..1f7912a71a 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,39 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 4294967295
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 0
+(1 row)
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction ID: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ERROR: invalid transaction ID: 1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+ERROR: invalid transaction ID: 2
+-- fail - must be superuser
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+ERROR: must be owner of subscription regress_testsub
+SET SESSION AUTHORIZATION 'regress_subscription_user';
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2 | 0
(1 row)
BEGIN;
@@ -129,10 +157,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2 | 0
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +193,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +216,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist | 0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
-- fail - publication already exists
@@ -215,10 +243,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
-- fail - publication used more then once
@@ -233,10 +261,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +298,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist | 0
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +310,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist | 0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +322,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist | 0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..aa15d12d9d 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,23 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+
+-- fail - must be superuser
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+SET SESSION AUTHORIZATION 'regress_subscription_user';
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/028_skip_xact.pl b/src/test/subscription/t/028_skip_xact.pl
new file mode 100644
index 0000000000..588f1f1d3e
--- /dev/null
+++ b/src/test/subscription/t/028_skip_xact.pl
@@ -0,0 +1,226 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use Test::More tests => 7;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with existing data on the subscriber. After
+# waiting for the subscription worker's error stats to be updated, we skip the
+# transaction in question with ALTER SUBSCRIPTION ... SKIP. Then, check that
+# logical replication can continue working by inserting $nonconflict_data on
+# the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname,
+ $nonconflict_data, $expected, $xid, $msg)
+ = @_;
+
+ local $Test::Builder::Level = $Test::Builder::Level + 1;
+
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND last_error_xid = '$xid'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]) or die "Timed out while waiting for worker error";
+
+ # Set skip xid
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (xid = '$xid')");
+
+ # Restart the subscriber node to restart logical replication with no interval
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subskipxid = 0 FROM pg_subscription WHERE subname = '$subname'"
+ );
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql('postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node. Set a low value to logical_decoding_work_mem
+# so we can test streaming cases easily.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf(
+ 'postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber, we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (two_phase = on, streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data into test_tab, raising an error on the subscriber due to a
+# violation of the unique constraint on test_tab. Then skip the transaction.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", $xid, "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data into test_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", $xid, "test skipping prepare and commit prepared");
+
+# Test for PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(4)", "4", $xid, "test skipping prepare and rollback prepared");
+
+# Test for STREAM COMMIT. Insert enough rows into test_tab_streaming to exceed
+# the 64kB limit, also raising an error on the subscriber while applying the
+# spooled changes, for the same reason. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming",
+ "test_tab_streaming", "(2, md5(2::text))",
+ "2", $xid, "test skipping stream-commit");
+
+# Test for STREAM PREPARE and COMMIT PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact(
+ $node_publisher, $node_subscriber,
+ "tap_sub_streaming", "test_tab_streaming",
+ "(3, md5(3::text))", "3",
+ $xid, "test skipping stream-prepare and commit prepared");
+
+# Test for STREAM PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact(
+ $node_publisher,
+ $node_subscriber,
+ "tap_sub_streaming",
+ "test_tab_streaming",
+ "(4, md5(4::text))",
+ "4",
+ $xid,
+ "test skipping stream-prepare and rollback prepared");
+
+my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0",
+ "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
On Wed, Jan 19, 2022 at 3:32 PM vignesh C <vignesh21@gmail.com> wrote:
On Sat, Jan 15, 2022 at 3:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 14, 2022 at 5:35 PM vignesh C <vignesh21@gmail.com> wrote:
Thanks for the updated patch, a few minor comments:

1) Should "SKIP" be "SKIP (" here:

@@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
 /* ALTER SUBSCRIPTION <name> */
 else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
 COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",

Won't the other rule added by the patch, as follows, be sufficient for what you are asking?

+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");

I might be missing something, but why do you think the handling of SKIP should be any different from what we are doing for SET?

In the case of "ALTER SUBSCRIPTION sub1 SET" there are 2 possible tab completion options: the user can either specify "ALTER SUBSCRIPTION sub1 SET PUBLICATION pub1" or "ALTER SUBSCRIPTION sub1 SET ( SET option like STREAMING, etc = 'on')", which is why we have 2 possible options as below:

postgres=# ALTER SUBSCRIPTION sub1 SET
( PUBLICATION

Whereas in the case of SKIP there is only one possible tab completion option, i.e. XID. We handle the WITH option similarly: we complete with "WITH (" in the case of tab completion for "CREATE PUBLICATION pub1"

postgres=# CREATE PUBLICATION pub1
FOR ALL TABLES FOR ALL TABLES IN SCHEMA FOR TABLE
WITH (

Right. I've incorporated this comment into the latest v9 patch[1].
Regards,
[1]: /messages/by-id/CAD21AoDOuNtvFUfU2wH2QgTJ6AyMXXh_vdA87qX0mUibdsrYTg@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Jan 19, 2022 at 4:14 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Tue, Jan 18, 2022 at 5:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached a rebased patch.
A couple of comments for the v8 patch:
Thank you for the comments!
doc/src/sgml/logical-replication.sgml
(1)
Strictly speaking, it's the transaction, not the transaction ID, that contains changes, so suggesting a minor change:

BEFORE: + The transaction ID that contains the change violating the constraint can be
AFTER: + The ID of the transaction that contains the change violating the constraint can be

doc/src/sgml/ref/alter_subscription.sgml

(2) apply_handle_commit_internal

It's not entirely apparent what commits the clearing of subskipxid here, so I suggest the following addition:

BEFORE: + * clear subskipxid of pg_subscription.
AFTER: + * clear subskipxid of pg_subscription, then commit.

These comments are merged with Peter's comments and incorporated into
the latest v9 patch[1].
Regards,
[1]: /messages/by-id/CAD21AoDOuNtvFUfU2wH2QgTJ6AyMXXh_vdA87qX0mUibdsrYTg@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Jan 19, 2022 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 19, 2022 at 12:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 19, 2022 at 12:22 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

(6) apply_handle_stream_abort
@@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
+ /*
+  * We don't expect the user to set the XID of the transaction that is
+  * rolled back but if the skip XID is set, clear it.
+  */
+ if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+     clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);

In my humble opinion, this still cares about the subtransaction xid. If we want to be consistent with top-level transactions only, I felt checking MySubscription->skipxid == xid should be sufficient.

I thought that if we can clear subskipxid whose value has already been processed on the subscriber at a reasonable cost, it makes sense to do so, because it reduces the possibility of the XID wrapping around while the wrong value is left in subskipxid.

I guess that could happen if the user sets some unrelated XID value.
So, I think it should be okay to not clear this but we can add a
comment in the code at that place that we don't clear subtransaction's
XID as we don't support skipping individual subtransactions or
something like that.
Agreed and added the comment in the latest patch[1].
Regards,
[1]: /messages/by-id/CAD21AoDOuNtvFUfU2wH2QgTJ6AyMXXh_vdA87qX0mUibdsrYTg@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Jan 21, 2022 at 8:39 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jan 21, 2022 at 1:18 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

I think the superuser check in AlterSubscription() might no longer be appropriate. Subscriptions can now be owned by non-superusers. Please check that.

IIUC we don't allow a non-superuser to own a subscription yet. We still have the following superuser checks.

In CreateSubscription():

if (!superuser())
    ereport(ERROR,
            (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
             errmsg("must be superuser to create subscriptions")));

and in AlterSubscriptionOwner_internal():

/* New owner must be a superuser */
if (!superuser_arg(newOwnerId))
    ereport(ERROR,
            (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
             errmsg("permission denied to change owner of subscription \"%s\"",
                    NameStr(form->subname)),
             errhint("The owner of a subscription must be a superuser.")));

Also, doing the superuser check here seems to be consistent with pg_replication_origin_advance(), which is another way to skip transactions and also requires superuser permission.
+1. I think this feature has the potential to make data inconsistent
and only be used as a last resort to resolve the conflicts so it is
better to allow this as a superuser.
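
For reference, the two ways of skipping line up roughly like this. This is only a sketch with illustrative values: the subscription name, the XID (716, taken from the conflict example earlier in the thread), the origin name, and the LSN are all made up for the example.

```sql
-- With the patch: skip exactly the failed remote transaction by its XID,
-- taken from the apply worker's errcontext or pg_stat_subscription_workers.
ALTER SUBSCRIPTION test_sub SKIP (xid = 716);

-- Existing alternative: advance the subscription's replication origin past
-- the failed transaction's end LSN. This is coarser, since it can end up
-- skipping more than the one transaction in question.
SELECT pg_replication_origin_advance('pg_16391', '0/1D0F468');
```

Both operations require superuser, which is the consistency point being made above.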
--
With Regards,
Amit Kapila.
On Tue, Jan 18, 2022 at 9:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 18, 2022 at 12:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 18, 2022 at 8:34 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:

On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

2) The following two places are not consistent in whether "= value" is surrounded with square brackets.

+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>

Should we modify the first place to:

+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )

Because currently there is only one skip_option, xid, and a parameter must be specified when using it.

Good observation. Do we really need [, ... ], as currently we support only one value for XID?

I think no. In the doc, it should be:
ALTER SUBSCRIPTION name SKIP ( skip_option = value )
In the latest patch, I see:
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable
class="parameter">skip_option</replaceable> = <replaceable
class="parameter">value</replaceable> [, ... ] )</literal></term>
What do we want to indicate by [, ... ]? To me, it appears like
multiple options but that is not what we support currently.
--
With Regards,
Amit Kapila.
On Fri, Jan 21, 2022 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 18, 2022 at 9:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 18, 2022 at 12:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 18, 2022 at 8:34 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:

On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

2) The following two places are not consistent in whether "= value" is surrounded with square brackets.

+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )</literal></term>

Should we modify the first place to:

+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] )

Because currently there is only one skip_option, xid, and a parameter must be specified when using it.

Good observation. Do we really need [, ... ], as currently we support only one value for XID?

I think no. In the doc, it should be:
ALTER SUBSCRIPTION name SKIP ( skip_option = value )
In the latest patch, I see:
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable
class="parameter">skip_option</replaceable> = <replaceable
class="parameter">value</replaceable> [, ... ] )</literal></term>

What do we want to indicate by [, ... ]? To me, it appears like multiple options, but that is not what we support currently.
You're right. It's an oversight.
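
With the [, ... ] dropped, the synopsis reduces to a single option. Something like the following usage is what the patch supports (the subscription name and XID value are illustrative; the NONE form matches the patch's reset handling):

```sql
-- Tell the apply worker to skip the remote transaction with this XID
ALTER SUBSCRIPTION test_sub SKIP (xid = 590);

-- Setting NONE resets pg_subscription.subskipxid to the invalid XID
ALTER SUBSCRIPTION test_sub SKIP (xid = NONE);
```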
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Friday, January 21, 2022 12:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch that incorporated these comments as well as
other comments I got so far.
Thank you for your update !
Few minor comments.
(1) trivial question
Is it perfectly clear to users, from the current doc description in v9, that in a cascading logical replication setup we can't selectively skip an arbitrary transaction on one of the upper nodes without also skipping all of its effects on the subsequent nodes?
IIUC, this is because we don't write the skipped changes to WAL either, and so can't propagate the contents to subsequent nodes.
I tested this case and, as I expected, the changes didn't propagate.
This can apply to the other measures for conflicts, though.
(2) suggestion
There's no harm in writing a notification for the committer, "Bump catalog version", in the commit log, as the patch changes the catalog.
(3) minor question
In the past, there was a discussion that it might be better to reset the XID when subconninfo is changed, since that might be an opportunity to connect to another publisher with a different XID space.
Currently, we regard this as the user's responsibility.
Is that correct?
Best Regards,
Takamichi Osumi
On Fri, Jan 21, 2022 at 10:32 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Friday, January 21, 2022 12:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch that incorporated these commends as well as
other comments I got so far.Thank you for your update !
Few minor comments.
(1) trivial question
For the users,
was it perfectly clear that in the cascading logical replication setup,
we can't selectively skip an arbitrary transaction of one upper nodes,
without skipping its all executions on subsequent nodes,
when we refer to the current doc description of v9 ?IIUC, this is because we don't write changes WAL either and
can't propagate the contents to subsequent nodes.I tested this case and it didn't, as I expected.
This can apply to other measures for conflicts, though.
Right, there is nothing new as the user will same effect when she uses
existing function pg_replication_origin_advance(). So, not sure if we
want to add something specific to this.
(3) minor question
In the past, there was a discussion that it might be better to reset the XID when subconninfo is changed, since that might be an opportunity to connect to another publisher with a different XID space.
Currently, we regard this as the user's responsibility.
Is that correct?

I think if the user points to another publisher, wouldn't she similarly need to change slot_name as well? If so, I think this can be treated in a similar way.
--
With Regards,
Amit Kapila.
On Fri, Jan 21, 2022 at 2:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch that incorporated these commends as
well as other comments I got so far.
src/backend/replication/logical/worker.c
(1)
Didn't you mean to say "check the" instead of "clear" in the following
comment? (the subtransaction's XID was never being cleared before,
just checked against the skipxid, and now that check has been removed)
+ * ... . Since we don't
+ * support skipping individual subtransactions we don't clear
+ * subtransaction's XID.
Other than that, the patch LGTM.
Regards,
Greg Nancarrow
Fujitsu Australia
On Friday, January 21, 2022 2:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 21, 2022 at 10:32 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Friday, January 21, 2022 12:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I've attached an updated patch that incorporated these comments as well as other comments I got so far.

Thank you for your update!

Few minor comments.

(1) trivial question
Is it perfectly clear to users, from the current doc description in v9, that in a cascading logical replication setup we can't selectively skip an arbitrary transaction on one of the upper nodes without also skipping all of its effects on the subsequent nodes?
IIUC, this is because we don't write the skipped changes to WAL either, and so can't propagate the contents to subsequent nodes.
I tested this case and, as I expected, the changes didn't propagate.
This can apply to the other measures for conflicts, though.

Right, there is nothing new here, as the user gets the same effect when she uses the existing function pg_replication_origin_advance(). So I'm not sure if we want to add something specific about this.
Okay, thank you for clarifying this !
That's good to know.
(3) minor question
In the past, there was a discussion that it might be better to reset the XID when subconninfo is changed, since that might be an opportunity to connect to another publisher with a different XID space.
Currently, we regard this as the user's responsibility.
Is that correct?

I think if the user points to another publisher, wouldn't she similarly need to change slot_name as well? If so, I think this can be treated in a similar way.

I see. Then, since switching slot_name in AlterSubscription() doesn't affect other columns, we don't need any special measure for this either, IIUC.
Thanks !
Best Regards,
Takamichi Osumi
On Fri, Jan 21, 2022 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jan 21, 2022 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
What do we want to indicate by [, ... ]? To me, it appears like multiple options, but that is not what we support currently.

You're right. It's an oversight.
I have fixed this and a few other things in the attached patch.
1.
The newly added column needs to be included in the following statement:
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit,
subpublications)
ON pg_subscription TO public;
2.
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Assert(is_skipping_changes());
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skip_xid)));
Isn't it better to move this LOG message to the end of this function? The clear* functions can raise an error, so it is better to emit the LOG after them. I have done that in the attached.
3.
+-- fail - must be superuser
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+ERROR: must be owner of subscription regress_testsub
This test doesn't seem right. You want to get the must-be-superuser error, but the error raised is about subscription ownership. I have changed this test to do what it intends to do.
Apart from this, I have changed a few comments and ran pgindent. Do let me know what you think of the changes.
Few things that I think we can improve in 028_skip_xact.pl are as follows:
After CREATE SUBSCRIPTION, wait for the initial sync to be over and the two_phase state to be enabled. Please see 021_twophase. For the streaming case, we might be able to ensure streaming even with less data. Can you please try that?
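
A sketch of the query the test could poll on the subscriber to wait for that, in the style of 021_twophase. The state code 'e' (enabled, as opposed to 'p' for pending) and the use of the two subscription names from this test are my assumptions to be verified against that existing test:

```sql
-- Poll until both subscriptions report the two_phase state as enabled
SELECT count(1) = 2 FROM pg_subscription
WHERE subname IN ('tap_sub', 'tap_sub_streaming')
  AND subtwophasestate = 'e';
```

Wrapping this in the node's poll_query_until() after the CREATE SUBSCRIPTION block would make the tests independent of initial-sync and two-phase-enable timing.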
--
With Regards,
Amit Kapila.
Attachments:
v10-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patchapplication/octet-stream; name=v10-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patchDownload
From 57bf7f98bc9f64f93a34d9b1fbacb30a570d1074 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v10] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction
on subscriber nodes.
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX),
updating pg_subscription.subskipxid field, telling the apply worker to
skip the transaction. The apply worker skips all data modification changes
within the specified transaction.
After skipping the transaction the apply worker clears
pg_subscription.subskipxid.
Author: Masahiko Sawada
Reviewed-by: Vignesh C, Greg Nancarrow, Takamichi Osumi, Haiying Tang, Hou Zhijie, Peter Eisentraut, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 ++
doc/src/sgml/logical-replication.sgml | 44 ++++-
doc/src/sgml/ref/alter_subscription.sgml | 42 +++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/subscriptioncmds.c | 53 ++++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 264 ++++++++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 5 +-
src/include/catalog/pg_subscription.h | 4 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 129 ++++++++------
src/test/regress/sql/subscription.sql | 22 +++
src/test/subscription/t/028_skip_xact.pl | 226 ++++++++++++++++++++++++
16 files changed, 767 insertions(+), 60 deletions(-)
create mode 100644 src/test/subscription/t/028_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 1e65c42..de3581b 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7738,6 +7738,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskipxid</structfield> <type>xid</type>
+ </para>
+ <para>
+ ID of the transaction whose changes are to be skipped, if a valid
+ transaction ID; otherwise 0.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
</para>
<para>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 96b4886..a89e545 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -352,16 +352,52 @@
</para>
<para>
- The resolution can be done either by changing data or permissions on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ When a conflict produces an error, it is shown in the
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+
+<programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_xid | 716
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ and it is also shown in subscriber's server log as follows:
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 at 2021-09-29 15:52:45.165754+00
+</screen>
+
+ The ID of the transaction that contains the change violating the constraint can be
+ found from those outputs (transaction ID 716 in the above case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ The resolution can be done by changing data or permissions on the subscriber so
+ that it does not conflict with incoming changes, by dropping the conflicting constraint
+ or unique index, or by writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or as a last resort, by skipping the whole transaction.
+ Skipping the whole transaction includes skipping changes that might not violate
+ any constraint. This can easily make the subscriber inconsistent, especially if
+ a user specifies the wrong transaction ID or the wrong position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc..24591df 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -208,6 +209,47 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</varlistentry>
<varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the specified remote transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming changes or by skipping
+ the whole transaction. Using the <command>ALTER SUBSCRIPTION ... SKIP</command>
+ command, the logical replication worker skips all data modification changes
+ within the specified transaction, including changes that might not violate
+ the constraint, so, it should only be used as a last resort. This option has
+ no effect on the transactions that are already prepared by enabling
+ <literal>two_phase</literal> on subscriber. After logical replication
+ successfully skips the transaction, the transaction ID (stored in
+ <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>xid</literal> (<type>xid</type>)</term>
+ <listitem>
+ <para>
+ Specifies the ID of the remote transaction whose changes are to be skipped
+ by the logical replication worker. Skipping
+ individual subtransactions is not supported. Setting <literal>NONE</literal>
+ resets the transaction ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
<para>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8b..da199e9 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skipxid = subform->subskipxid;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3cb69b1..4306ca0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,8 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subslotname, subsynccommit, subpublications)
+ substream, subtwophasestate, subskipxid, subslotname, subsynccommit,
+ subpublications)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_workers AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f5eba45..311291e 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -61,6 +61,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_XID 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +83,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ TransactionId xid; /* InvalidTransactionId for resetting purpose,
+ * otherwise normal transaction id */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +252,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_XID) &&
+ strcmp(defel->defname, "xid") == 0)
+ {
+ char *xid_str = defGetString(defel);
+ TransactionId xid;
+
+ if (IsSet(opts->specified_opts, SUBOPT_XID))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting xid = NONE is treated as resetting xid */
+ if (strcmp(xid_str, "none") == 0)
+ xid = InvalidTransactionId;
+ else
+ {
+ /* Parse the argument as TransactionId */
+ xid = DatumGetTransactionId(DirectFunctionCall1(xidin,
+ CStringGetDatum(xid_str)));
+
+ if (!TransactionIdIsNormal(xid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid transaction ID: %s", xid_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_XID;
+ opts->xid = xid;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +494,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1115,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only xid option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_XID));
+
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(opts.xid);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ update_tuple = true;
+
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b596671..d1fc1a8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9954,6 +9954,15 @@ AlterSubscriptionStmt:
(Node *)makeBoolean(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c9af775..d516e93 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -257,6 +258,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the user has specified subskipxid. Once we start
+ * skipping changes, we don't stop until we have skipped all changes of the
+ * transaction, even if pg_subscription is updated and MySubscription->skipxid
+ * gets changed or reset in the meantime. Also, for streaming transactions, we
+ * don't skip receiving and spooling the changes, since we decide whether or
+ * not to skip applying them only when we start applying the transaction. At
+ * the end of the transaction, we disable skipping and reset subskipxid. The
+ * timing of resetting subskipxid differs between the commit case and the
+ * commit/rollback prepared cases. Please refer to the comments in the
+ * corresponding functions for details.
+ */
+static TransactionId skip_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skip_xid))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -332,6 +348,13 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(TransactionId xid);
+static void stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+static void clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
@@ -791,6 +814,8 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -843,6 +868,8 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ maybe_start_skipping_changes(begin_data.xid);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -857,6 +884,36 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
char gid[GIDSIZE];
/*
+ * If we are skipping all changes of this transaction, we stop doing so
+ * here but, unlike the commit case, we do not clear subskipxid of the
+ * pg_subscription catalog; we do that at COMMIT PREPARED or ROLLBACK
+ * PREPARED time instead. If we updated the catalog and then prepared the
+ * transaction, the catalog change would become part of the prepared
+ * transaction. If we did that in the reverse order, subskipxid would not
+ * be cleared, yet this transaction would not be resent if the server
+ * crashed between the two steps.
+ *
+ * subskipxid might be changed or cleared by the user before we receive
+ * COMMIT PREPARED or ROLLBACK PREPARED. But that's okay because this
+ * prepared transaction is empty.
+ *
+ * One might think that we could skip preparing the skipped transaction
+ * and also skip its COMMIT PREPARED or ROLLBACK PREPARED by comparing the
+ * XID received as part of the message to subskipxid. But subskipxid could
+ * be changed by users between PREPARE and COMMIT PREPARED or ROLLBACK
+ * PREPARED. There was an idea to disallow users from changing subskipxid
+ * while skipping changes, but we don't know when COMMIT PREPARED or
+ * ROLLBACK PREPARED will arrive, and another conflict could occur in the
+ * meanwhile. If such a conflict occurred, we could not skip that
+ * transaction by using subskipxid. There was also another idea to check
+ * whether the transaction has been prepared by checking the GID,
+ * origin LSN, and origin timestamp of the prepared transaction, but that
+ * doesn't seem worthwhile because it requires protocol changes, and
+ * skipping transactions shouldn't be common.
+ */
+ if (is_skipping_changes())
+ stop_skipping_changes(false, InvalidXLogRecPtr, 0);
+
+ /*
* Compute unique GID for two_phase transactions. We don't use GID of
* prepared transaction sent by server as that can lead to deadlock when
* we have multiple subscriptions from same node point to publications on
@@ -901,9 +958,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -940,6 +997,23 @@ apply_handle_commit_prepared(StringInfo s)
logicalrep_read_commit_prepared(s, &prepare_data);
set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ if (MySubscription->skipxid == prepare_data.xid)
+ {
+ /*
+ * Clear subskipxid of the pg_subscription catalog. This catalog
+ * update must be committed before finishing the prepared transaction;
+ * otherwise, if the server crashed between finishing the prepared
+ * transaction and the catalog update, COMMIT PREPARED would not be
+ * resent but subskipxid would be left set.
+ *
+ * Also, we must not update the replication origin LSN and timestamp
+ * while committing the catalog update, so that COMMIT PREPARED will be
+ * resent in case of a crash immediately after the catalog update
+ * commits.
+ */
+ clear_subscription_skip_xid(prepare_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
gid, sizeof(gid));
@@ -981,6 +1055,17 @@ apply_handle_rollback_prepared(StringInfo s)
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ if (MySubscription->skipxid == rollback_data.xid)
+ {
+ /*
+ * Same as COMMIT PREPARED, we must clear subskipxid of
+ * pg_subscription before rolling back the prepared transaction.
+ * Please see the comments in apply_handle_commit_prepared() for
+ * details.
+ */
+ clear_subscription_skip_xid(rollback_data.xid, InvalidXLogRecPtr, 0);
+ }
+
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
gid, sizeof(gid));
@@ -1210,6 +1295,15 @@ apply_handle_stream_abort(StringInfo s)
logicalrep_read_stream_abort(s, &xid, &subxid);
/*
+ * We don't expect the user to set the XID of a transaction that is
+ * rolled back, but if the skip XID is set, clear it. Since we don't
+ * support skipping individual subtransactions, we don't do this for
+ * the subtransaction's XID.
+ */
+ if (MySubscription->skipxid == xid)
+ clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+
+ /*
* If the two XIDs are the same, it's in fact abort of toplevel xact, so
* just delete the files with serialized info.
*/
@@ -1331,6 +1425,8 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ maybe_start_skipping_changes(xid);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1451,7 +1547,23 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
- if (IsTransactionState())
+ if (is_skipping_changes())
+ {
+ /*
+ * If we are skipping all changes of this transaction, we stop it and
+ * clear subskipxid of pg_subscription.
+ */
+ stop_skipping_changes(true, commit_data->end_lsn,
+ commit_data->committime);
+
+ /* The catalog update clearing subskipxid must have been committed */
+ Assert(!IsTransactionState());
+
+ pgstat_report_stat(false);
+
+ store_flush_position(commit_data->end_lsn);
+ }
+ else if (IsTransactionState())
{
/*
* Update origin state so we can restart streaming from correct
@@ -2367,6 +2479,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType saved_command;
/*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
+ /*
* Set the current command being applied. Since this function can be
* called recusively when applying spooled changes, save the current
* command.
@@ -3661,6 +3784,139 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by subscription's skipxid.
+ */
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (MySubscription->skipxid != xid)
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skip_xid = xid;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction %u",
+ xid));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xid. If clear_subskipxid is true,
+ * we also clear subskipxid of pg_subscription by setting it to
+ * InvalidTransactionId. Both origin_lsn and origin_timestamp are used to
+ * update the origin state when clearing subskipxid so that we can restart
+ * streaming from the correct position in case of a crash.
+ */
+static void
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Assert(is_skipping_changes());
+
+ if (clear_subskipxid)
+ {
+ clear_subscription_skip_xid(skip_xid, origin_lsn, origin_timestamp);
+
+ /* Make sure that clearing subskipxid is committed */
+ if (IsTransactionState())
+ CommitTransactionCommand();
+ }
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skip_xid)));
+
+ /* Stop skipping changes */
+ skip_xid = InvalidTransactionId;
+}
+
+/* Clear subskipxid of pg_subscription catalog */
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool started_tx = false;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskipxid of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ /* Get subskipxid value */
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskipxid of the tuple to InvalidTransactionId. If the
+ * user has already changed subskipxid before we clear it, we don't
+ * update the catalog and don't advance the replication origin state.
+ * So, in the worst case, if the server crashes before sending an
+ * acknowledgment of the flush position, the transaction will be sent
+ * again and the user needs to set subskipxid again. We could reduce
+ * that possibility by logging a replication origin WAL record to
+ * advance the origin LSN instead, but there is no way to advance the
+ * origin timestamp, and it doesn't seem worth doing anything about it
+ * since this is a very rare case.
+ */
+ if (subform->subskipxid == xid)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskipxid */
+ values[Anum_pg_subscription_subskipxid - 1] =
+ TransactionIdGetDatum(InvalidTransactionId);
+ replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_timestamp;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7c2f1d3..f3c2867 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4304,6 +4304,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't include subskipxid in the dump
+ * since the value may no longer be relevant after the dump is restored.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 40433e3..fa24759 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6051,7 +6051,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false};
+ false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6096,6 +6096,12 @@ describeSubscriptions(const char *pattern, bool verbose)
", subconninfo AS \"%s\"\n",
gettext_noop("Synchronous commit"),
gettext_noop("Conninfo"));
+
+ /* Skip XID is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskipxid AS \"%s\"\n",
+ gettext_noop("Skip XID"));
}
/* Only display subscriptions in current database. */
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6bd33a0..e402c26 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1710,7 +1710,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP (",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1726,6 +1726,9 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("xid");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c2912..3200faa 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ TransactionId subskipxid; /* All changes associated with this XID are
+ * skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ TransactionId skipxid; /* All changes of the XID are skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3e9bdc7..c7f9d12 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3717,7 +3717,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83..ac71f7d 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,44 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 4294967295
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ subname | subskipxid
+-----------------+------------
+ regress_testsub | 0
+(1 row)
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR: invalid transaction ID: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ERROR: invalid transaction ID: 1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+ERROR: invalid transaction ID: 2
+-- fail - must be superuser. We need to try this operation as the
+-- subscription owner.
+ALTER ROLE regress_subscription_user2 SUPERUSER;
+ALTER ROLE regress_subscription_user NOSUPERUSER;
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+ERROR: must be superuser to skip transaction
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER ROLE regress_subscription_user SUPERUSER;
+ALTER ROLE regress_subscription_user2 NOSUPERUSER;
+SET SESSION AUTHORIZATION 'regress_subscription_user';
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2 | 0
(1 row)
BEGIN;
@@ -129,10 +162,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2 | 0
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +198,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +221,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist | 0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
-- fail - publication already exists
@@ -215,10 +248,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
-- fail - publication used more then once
@@ -233,10 +266,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +303,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist | 0
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +315,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist | 0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +327,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip XID
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist | 0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af..664268e 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,28 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - valid xid
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 3);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 4294967295);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = NONE);
+SELECT subname, subskipxid FROM pg_subscription WHERE subname = 'regress_testsub';
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 2);
+
+-- fail - must be superuser. We need to try this operation as subscription
+-- owner.
+ALTER ROLE regress_subscription_user2 SUPERUSER;
+ALTER ROLE regress_subscription_user NOSUPERUSER;
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER ROLE regress_subscription_user SUPERUSER;
+ALTER ROLE regress_subscription_user2 NOSUPERUSER;
+SET SESSION AUTHORIZATION 'regress_subscription_user';
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/028_skip_xact.pl b/src/test/subscription/t/028_skip_xact.pl
new file mode 100644
index 0000000..588f1f1
--- /dev/null
+++ b/src/test/subscription/t/028_skip_xact.pl
@@ -0,0 +1,226 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use Test::More tests => 7;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with the subscriber. After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication can continue
+# working by inserting $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname,
+ $nonconflict_data, $expected, $xid, $msg)
+ = @_;
+
+ local $Test::Builder::Level = $Test::Builder::Level + 1;
+
+ # Wait for worker error
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT count(1) > 0
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND last_error_xid = '$xid'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]) or die "Timed out while waiting for worker error";
+
+ # Set skip xid
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (xid = '$xid')");
+
+ # Restart the subscriber node to restart logical replication with no interval
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subskipxid = 0 FROM pg_subscription WHERE subname = '$subname'"
+ );
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql('postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node. Set a low value to logical_decoding_work_mem
+# so we can test streaming cases easily.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf(
+ 'postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber, we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (two_phase = on, streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data to test_tab, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab. Then skip the transaction.
+my $xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", $xid, "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data to test_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", $xid, "test skipping prepare and commit prepared ");
+
+# Test for PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(4)", "4", $xid, "test skipping prepare and rollback prepared");
+
+# Test for STREAM COMMIT. Insert enough rows to test_tab_streaming to exceed the 64kB
+# limit, also raising an error on the subscriber during applying spooled changes for the
+# same reason. Then skip the transaction.
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming",
+ "test_tab_streaming", "(2, md5(2::text))",
+ "2", $xid, "test skipping stream-commit");
+
+# Test for STREAM PREPARE and COMMIT PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact(
+ $node_publisher, $node_subscriber,
+ "tap_sub_streaming", "test_tab_streaming",
+ "(3, md5(3::text))", "3",
+ $xid, "test skipping stream-prepare and commit prepared");
+
+# Test for STREAM PREPARE and ROLLBACK PREPARED
+$xid = $node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+SELECT pg_current_xact_id()::xid;
+PREPARE TRANSACTION 'gtx';
+ROLLBACK PREPARED 'gtx';
+]);
+test_skip_xact(
+ $node_publisher,
+ $node_subscriber,
+ "tap_sub_streaming",
+ "test_tab_streaming",
+ "(4, md5(4::text))",
+ "4",
+ $xid,
+ "test skipping stream-prepare and rollback prepared");
+
+my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0",
+ "check all prepared transactions are resolved on the subscriber");
--
1.8.3.1
On Fri, Jan 21, 2022 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 21, 2022 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
A few things that I think we can improve in 028_skip_xact.pl are as follows:
After CREATE SUBSCRIPTION, wait for initial sync to be over and
two_phase state to be enabled. Please see 021_twophase. For the
streaming case, we might be able to ensure streaming even with lesser
data. Can you please try that?
I noticed that the time taken by the newly added test in this patch is on
the upper side. See the comparison with the subscription test that takes
the max time:
[17:38:49] t/028_skip_xact.pl ................. ok 9298 ms
[17:38:59] t/100_bugs.pl ...................... ok 11349 ms
I think we can reduce the time by removing some stream tests without much
impact on coverage, possibly those related to 2PC and streaming together,
and if you do that we probably don't need a subscription with both 2PC
and streaming enabled.
--
With Regards,
Amit Kapila.
On 21.01.22 04:08, Masahiko Sawada wrote:
I think the superuser check in AlterSubscription() might no longer be
appropriate. Subscriptions can now be owned by non-superusers. Please
check that.
IIUC we don't allow non-superuser to own the subscription yet. We
still have the following superuser checks:
In CreateSubscription():
if (!superuser())
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser to create subscriptions")));
and in AlterSubscriptionOwner_internal():
/* New owner must be a superuser */
if (!superuser_arg(newOwnerId))
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("permission denied to change owner of
subscription \"%s\"",
NameStr(form->subname)),
errhint("The owner of a subscription must be a superuser.")));
Also, doing superuser check here seems to be consistent with
pg_replication_origin_advance() which is another way to skip
transactions and also requires superuser permission.
I'm referring to commit a2ab9c06ea15fbcb2bfde570986a06b37f52bcca. You
still have to be superuser to create a subscription, but you can change
the owner to a nonprivileged user and it will observe table permissions
on the subscriber.
Assuming my understanding of that commit is correct, I think it would be
sufficient in your patch to check that the current user is the owner of
the subscription.
On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Apart from this, I have changed a few comments and ran pgindent. Do
let me know what you think of the changes?
The paragraph describing ALTER SUBSCRIPTION SKIP seems unnecessarily
repetitive. Consider:
"""
Skips applying all changes of the specified remote transaction, whose value
should be obtained from pg_stat_subscription_workers.last_error_xid. This
will avoid the last error on the subscription, thus allowing it to resume
working. See "link to a more holistic description in
the Logical Replication chapter" for alternative means of resolving
subscription errors. Removing an entire transaction from the history of a
table should be considered a last resort as it can leave the system in a
very inconsistent state.
Note, this feature will not accept transactions prepared under two-phase
commit.
This command sets pg_subscription.subskipxid field upon issuance and the
system clears the same field upon seeing and successfully skipping the
identified transaction. Issuing this command again while a skipped
transaction is pending replaces the existing transaction with the new one.
"""
Then change the subskipxid column description to be:
"""
ID of the transaction whose changes are to be skipped. It is 0 when there
are no pending skips. This is set by issuing ALTER SUBSCRIPTION SKIP and
resets back to 0 when the identified transaction passes through the
subscription stream and is successfully ignored.
"""
I don't understand why/how ", if a valid transaction ID;" comes into play
(how would we know whether it is valid, or if we do ALTER SUBSCRIPTION SKIP
should prohibit the invalid value from being chosen).
I'm against mentioning subtransactions in the skip_option description.
The Logical Replication page changes provide good content overall but I
dislike going into detail about how to perform conflict resolution in the
third paragraph and then summarize the various forms of conflict resolution
in the newly added fourth. Maybe re-work things like:
1. Logical replication behaves...
2. A conflict will produce...details can be found in places...
3. Resolving conflicts can be done by...
4. (split and reworded) If choosing to simply skip the offending
transaction you take the pg_stat_subscription_worker.last_error_xid value
(716 in the example above) and provide it while executing ALTER
SUBSCRIPTION SKIP...
5. (split and reworded) Prior to v15 ALTER SUBSCRIPTION SKIP was not
available and instead you had to use the pg_replication_origin_advance()
function...
Don't just list out two options for the user to perform the same action.
Tell a story about why we felt compelled to add ALTER SUBSCRIPTION SKIP and why
either the function is now deprecated or is useful given different
circumstances (the former seems likely).
David J.
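[For readers following the thread, the workflow being described might look like the following hypothetical psql session on the subscriber. The view, its columns, and the SKIP syntax come from the patch series under discussion (not released PostgreSQL), and the subscription name "mysub" is invented:

```sql
-- On the subscriber: identify the failing remote transaction.
SELECT subname, last_error_xid, last_error_command, last_error_message
FROM pg_stat_subscription_workers;

-- Skip exactly that remote transaction (716 from the example above).
ALTER SUBSCRIPTION mysub SKIP (xid = 716);

-- The pending skip is recorded in subskipxid and resets to 0 once the
-- transaction has been seen and successfully ignored.
SELECT subname, subskipxid FROM pg_subscription;

-- A pending skip can also be cleared manually.
ALTER SUBSCRIPTION mysub SKIP (xid = NONE);
```
]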
On Fri, Jan 21, 2022 at 7:23 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 21.01.22 04:08, Masahiko Sawada wrote:
I think the superuser check in AlterSubscription() might no longer be
appropriate. Subscriptions can now be owned by non-superusers. Please
check that.
IIUC we don't allow non-superuser to own the subscription yet. We
still have the following superuser checks:
In CreateSubscription():
if (!superuser())
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser to create subscriptions")));
and in AlterSubscriptionOwner_internal():
/* New owner must be a superuser */
if (!superuser_arg(newOwnerId))
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("permission denied to change owner of
subscription \"%s\"",
NameStr(form->subname)),
errhint("The owner of a subscription must be a superuser.")));
Also, doing superuser check here seems to be consistent with
pg_replication_origin_advance() which is another way to skip
transactions and also requires superuser permission.
I'm referring to commit a2ab9c06ea15fbcb2bfde570986a06b37f52bcca. You
still have to be superuser to create a subscription, but you can change
the owner to a nonprivileged user and it will observe table permissions
on the subscriber.
Assuming my understanding of that commit is correct, I think it would be
sufficient in your patch to check that the current user is the owner of
the subscription.
Won't we already do that for Alter Subscription command which means
nothing special needs to be done for this? However, it seems to me
that the idea we are trying to follow here is that as this option can
lead to data inconsistency, it is good to allow only superusers to
specify this option. The owner of the subscription can be changed to
non-superuser as well in which case I think it won't be a good idea to
allow this option. OTOH, if we think it is okay to allow such an
option to users that don't have superuser privilege then I think
allowing it to the owner of the subscription makes sense to me.
--
With Regards,
Amit Kapila.
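[To make the scenario being debated concrete, here is a hypothetical SQL sketch, following the role-juggling pattern of the patch's own regression test (names invented): with the superuser check kept in AlterSubscription(), a subscription whose owner has lost superuser cannot use SKIP:

```sql
-- A subscription is created by a superuser, then its owner is demoted
-- (the same dance the patch's regression test performs with two roles).
ALTER SUBSCRIPTION tap_sub OWNER TO regress_subscription_user2;
ALTER ROLE regress_subscription_user2 NOSUPERUSER;

-- With the superuser check in AlterSubscription(), the non-superuser
-- owner is now blocked from skipping a transaction:
SET SESSION AUTHORIZATION regress_subscription_user2;
ALTER SUBSCRIPTION tap_sub SKIP (xid = 100);  -- ERROR: must be superuser
```
]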
On Fri, Jan 21, 2022 at 10:00 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Apart from this, I have changed a few comments and ran pgindent. Do
let me know what you think of the changes?
The paragraph describing ALTER SUBSCRIPTION SKIP seems unnecessarily repetitive. Consider:
"""
Skips applying all changes of the specified remote transaction, whose value should be obtained from pg_stat_subscription_workers.last_error_xid.
Here, you can also say that the value can be found from server logs as well.
This will avoid the last error on the subscription, thus allowing it to
resume working. See "link to a more
holistic description in the Logical Replication chapter" for
alternative means of resolving subscription errors. Removing an
entire transaction from the history of a table should be considered a
last resort as it can leave the system in a very inconsistent state.
Note, this feature will not accept transactions prepared under two-phase commit.
This command sets pg_subscription.subskipxid field upon issuance and the system clears the same field upon seeing and successfully skipping the identified transaction. Issuing this command again while a skipped transaction is pending replaces the existing transaction with the new one.
"""
The proposed text sounds better to me except for a minor change as
suggested above.
Then change the subskipxid column description to be:
"""
ID of the transaction whose changes are to be skipped. It is 0 when there are no pending skips. This is set by issuing ALTER SUBSCRIPTION SKIP and resets back to 0 when the identified transaction passes through the subscription stream and is successfully ignored.
"""
Users can manually reset it by specifying NONE, so that should be
covered in the above text, otherwise, looks good.
I don't understand why/how ", if a valid transaction ID;" comes into play (how would we know whether it is valid, or if we do ALTER SUBSCRIPTION SKIP should prohibit the invalid value from being chosen).
What do you mean by invalid value here? Is it the value lesser than
FirstNormalTransactionId or a value that is of the non-error
transaction? For the former, we already have a check in the patch and
for later we can't identify it with any certainty because the error
stats are collected by the stats collector.
I'm against mentioning subtransactions in the skip_option description.
We have mentioned that because currently, we don't support it but in
the future one can come up with an idea to support it. What problem do
you see with it?
The Logical Replication page changes provide good content overall but I dislike going into detail about how to perform conflict resolution in the third paragraph and then summarize the various forms of conflict resolution in the newly added fourth. Maybe re-work things like:
1. Logical replication behaves...
2. A conflict will produce...details can be found in places...
3. Resolving conflicts can be done by...
4. (split and reworded) If choosing to simply skip the offending transaction you take the pg_stat_subscription_worker.last_error_xid value (716 in the example above) and provide it while executing ALTER SUBSCRIPTION SKIP...
5. (split and reworded) Prior to v15 ALTER SUBSCRIPTION SKIP was not available and instead you had to use the pg_replication_origin_advance() function...
Don't just list out two options for the user to perform the same action. Tell a story about why we felt compelled to add ALTER SUBSCRIPTION SKIP and why either the function is now deprecated or is useful given different circumstances (the former seems likely).
Personally, I don't see much value in the split (especially giving
context like "Prior to v15 ..") but I do see value in specifying the
circumstances where each of the options could be useful.
--
With Regards,
Amit Kapila.
On Fri, Jan 21, 2022 at 10:30 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Fri, Jan 21, 2022 at 10:00 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com>
wrote:
Apart from this, I have changed a few comments and ran pgindent. Do
let me know what you think of the changes?
The paragraph describing ALTER SUBSCRIPTION SKIP seems unnecessarily
repetitive. Consider:
"""
Skips applying all changes of the specified remote transaction, whose value should be obtained from pg_stat_subscription_workers.last_error_xid.
Here, you can also say that the value can be found from server logs as
well.
subscriber's server logs, right? I would agree that adding that for
completeness is warranted.
Then change the subskipxid column description to be:
"""
ID of the transaction whose changes are to be skipped. It is 0 when there
are no pending skips. This is set by issuing ALTER SUBSCRIPTION SKIP and
resets back to 0 when the identified transaction passes through the
subscription stream and is successfully ignored."""
Users can manually reset it by specifying NONE, so that should be
covered in the above text, otherwise, looks good.
I agree with incorporating "reset" into the paragraph somehow - does not
have to mention NONE, just that ALTER SUBSCRIPTION SKIP (not a family
friendly abbreviation...) is what does it.
I don't understand why/how ", if a valid transaction ID;" comes into
play (how would we know whether it is valid, or if we do ALTER SUBSCRIPTION
SKIP should prohibit the invalid value from being chosen).
What do you mean by invalid value here? Is it the value lesser than
FirstNormalTransactionId or a value that is of the non-error
transaction? For the former, we already have a check in the patch and
for later we can't identify it with any certainty because the error
stats are collected by the stats collector.
The original proposal qualifies the non-zero transaction id in
subskipxid as being a "valid transaction ID" and that invalid ones (which
is how "otherwise" is interpreted given the "valid" qualification preceding
it) are shown as 0. As an end-user that makes me wonder what it means for
a transaction ID to be invalid. My point is that dropping the mention of
"valid transaction ID" avoids that and lets the reader operate with an
understanding that things should "just work". If I see a non-zero in the
column I have a pending skip and if I see zero I do not. My wording
assumes it is that simple. If it isn't I would need some clarity as to why
it is not in order to write something I could read and understand from my
inexperienced user-centric point-of-view.
I get that I may provide a transaction ID that is invalid such that the
system could never see it (or at least not for a long while) - say we
error on transaction 102 and I typo it as 1002 or 101. But I would expect
either an error where I make the typo or the numbers 1002 or 101 to appear
on the table. I would not expect my 101 typo to result in a 0 appearing on
the table (and if it does so today I argue that is a POLA violation).
Thus, "if a valid transaction ID" from the original text just doesn't make
sense to me.
In typical usage it would seem strange to allow a skip to be recorded if
there is no existing error in the subscription. Should we (do we, haven't
read the code) warn in that situation?
*Or, why even force them to specify a number instead of just saying SKIP
and if there is a current error we skip its transaction, otherwise we warn
them that nothing happened because there is no last error.*
Additionally, the description for pg_stat_subscription_workers should
describe what happens once the transaction represented by last_error_xid
has either been successfully processed or skipped. Does this "last error"
stick around until another error happens (which is hopefully very rare) or
does it reset to blanks? Seems like it should reset, which really makes
this more of an "active_error" instead of a "last_error". This system is
linear, we are stuck until this error is resolved, making it active.
I'm against mentioning subtransactions in the skip_option description.
We have mentioned that because currently, we don't support it but in
the future one can come up with an idea to support it. What problem do
you see with it?
If you ever get around to implementing the feature then by all means add
it. My main issue is that we basically never talk about subtransactions in
the user-facing documentation and it doesn't seem desirable to do so here.
Knowing that a whole transaction is skipped is all I need to care about as
a user. I believe that no users will be asking "what about subtransactions
(savepoints)" but by mentioning it less experienced ones will now have
something to be curious about that they really do not need to be.
The Logical Replication page changes provide good content overall but I
dislike going into detail about how to perform conflict resolution in the
third paragraph and then summarize the various forms of conflict resolution
in the newly added fourth. Maybe re-work things like:
Personally, I don't see much value in the split (especially giving
context like "Prior to v15 ..) but specifying the circumstances where
each of the options could be useful.
Yes, I've been reminded of the desire to avoid mentioning versions and
agree doing so here is correct. The added context is desired, the style
depends on the content.
David J.
On Sat, Jan 22, 2022 at 12:41 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Fri, Jan 21, 2022 at 10:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 21, 2022 at 10:00 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Apart from this, I have changed a few comments and ran pgindent. Do
let me know what you think of the changes?
The paragraph describing ALTER SUBSCRIPTION SKIP seems unnecessarily repetitive. Consider:
"""
Skips applying all changes of the specified remote transaction, whose value should be obtained from pg_stat_subscription_workers.last_error_xid.
Here, you can also say that the value can be found from server logs as well.
subscriber's server logs, right?
Right.
I would agree that adding that for completeness is warranted.
Then change the subskipxid column description to be:
"""
ID of the transaction whose changes are to be skipped. It is 0 when there are no pending skips. This is set by issuing ALTER SUBSCRIPTION SKIP and resets back to 0 when the identified transaction passes through the subscription stream and is successfully ignored.
"""
Users can manually reset it by specifying NONE, so that should be
covered in the above text, otherwise, looks good.
I agree with incorporating "reset" into the paragraph somehow - does not have to mention NONE, just that ALTER SUBSCRIPTION SKIP (not a family friendly abbreviation...) is what does it.
It is not clear to me what you have in mind here but to me in this
context saying "Setting <literal>NONE</literal> resets the transaction
ID." seems quite reasonable.
I don't understand why/how ", if a valid transaction ID;" comes into play (how would we know whether it is valid, or if we do ALTER SUBSCRIPTION SKIP should prohibit the invalid value from being chosen).
What do you mean by invalid value here? Is it the value lesser than
FirstNormalTransactionId or a value that is of the non-error
transaction? For the former, we already have a check in the patch and
for later we can't identify it with any certainty because the error
stats are collected by the stats collector.
The original proposal qualifies the non-zero transaction id in subskipxid as being a "valid transaction ID" and that invalid ones (which is how "otherwise" is interpreted given the "valid" qualification preceding it) are shown as 0. As an end-user that makes me wonder what it means for a transaction ID to be invalid. My point is that dropping the mention of "valid transaction ID" avoids that and lets the reader operate with an understanding that things should "just work". If I see a non-zero in the column I have a pending skip and if I see zero I do not. My wording assumes it is that simple. If it isn't I would need some clarity as to why it is not in order to write something I could read and understand from my inexperienced user-centric point-of-view.
I get that I may provide a transaction ID that is invalid such that the system could never see it (or at least not for a long while) - say we error on transaction 102 and I typo it as 1002 or 101. But I would expect either an error where I make the typo or the numbers 1002 or 101 to appear on the table. I would not expect my 101 typo to result in a 0 appearing on the table (and if it does so today I argue that is a POLA violation). Thus, "if a valid transaction ID" from the original text just doesn't make sense to me.
In typical usage it would seem strange to allow a skip to be recorded if there is no existing error in the subscription. Should we (do we, haven't read the code) warn in that situation?
Yeah, we will error in that situation. The only invalid values are
system reserved values (1,2).
Or, why even force them to specify a number instead of just saying SKIP and if there is a current error we skip its transaction, otherwise we warn them that nothing happened because there is no last error.
The idea is that we might extend this feature to skip specific
operations on relations or maybe by having other identifiers. One idea
we discussed was to automatically fetch the last error xid but then
decided it can be done as a later patch.
Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped. Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks?
It will be reset only on subscription drop, otherwise, it will stick
around until another error happens.
Seems like it should reset, which really makes this more of an "active_error" instead of a "last_error". This system is linear, we are stuck until this error is resolved, making it active.
I'm against mentioning subtransactions in the skip_option description.
We have mentioned that because currently, we don't support it but in
the future one can come up with an idea to support it. What problem do
you see with it?
If you ever get around to implementing the feature then by all means add it. My main issue is that we basically never talk about subtransactions in the user-facing documentation and it doesn't seem desirable to do so here. Knowing that a whole transaction is skipped is all I need to care about as a user. I believe that no users will be asking "what about subtransactions (savepoints)" but by mentioning it less experienced ones will now have something to be curious about that they really do not need to be.
It is not that we don't mention subtransactions in the docs but I see
your point and I think we can avoid mentioning it in this case.
--
With Regards,
Amit Kapila.
On Sat, Jan 22, 2022 at 2:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Jan 22, 2022 at 12:41 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Fri, Jan 21, 2022 at 10:30 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Fri, Jan 21, 2022 at 10:00 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com>
wrote:
I agree with incorporating "reset" into the paragraph somehow - does not
have to mention NONE, just that ALTER SUBSCRIPTION SKIP (not a family
friendly abbreviation...) is what does it.

It is not clear to me what you have in mind here but to me in this
context saying "Setting <literal>NONE</literal> resets the transaction
ID." seems quite reasonable.
OK
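For reference, the syntax under discussion might look like the following (the subscription name is hypothetical; the xid would be taken from the error context, e.g. 590 in the earlier example):

```sql
-- skip the remote transaction with the given xid
ALTER SUBSCRIPTION test_sub SKIP (xid = 590);

-- reset: setting NONE clears the transaction ID without skipping anything
ALTER SUBSCRIPTION test_sub SKIP (xid = NONE);
```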
Yeah, we will error in that situation. The only invalid values are
system reserved values (1,2).
So long as the ALTER command errors when asked to skip those IDs there
isn't any reason for an end-user, who likely doesn't know or care that 1
and 2 are special, to be concerned about them (the only two invalid values)
while reading the docs.
Or, why even force them to specify a number instead of just saying SKIP
and if there is a current error we skip its transaction, otherwise we warn
them that nothing happened because there is no last error.

The idea is that we might extend this feature to skip specific
operations on relations or maybe by having other identifiers.
Again, you've already got syntax reserved that lets you add more features
to this command in the future; and removing warnings or errors because new
features make them moot is easy. Let's document and code what we are
willing to implement today. A single top-level transaction xid that is
presently blocking the worker from applying any more WAL.
One idea
we discussed was to automatically fetch the last error xid but then
decided it can be done as a later patch.
This seems backwards. The user-friendly approach is to not make them type
in anything at all. That said, this particular UX seems like it could use
some safety. Thus I would propose at this time that attempting to set the
skip_option to anything but THE active_error_xid for the named subscription
results in an error. Once you add new features the user can set the
skip_option to other things without provoking errors. Again, I consider
this a safety feature since the user now has to accurately match the xid to
the name in the SQL in order to perform a successful skip - and the to-be
affected transaction has to be one that is preventing replication from
moving forward. I'm not interested in providing a foot-gun where an
arbitrary future transaction can be scheduled to be skipped. Running the
command twice with the same values should provoke an error since the first
run should be allowed to finish (?). Also, we handle the situation where
the state of the worker changes between when the user saw the error and
wrote down the xid to skip and the actual execution of the alter command.
Maybe not highly anticipated scenarios but this is an easy win to deal with
them.
Additionally, the description for pg_stat_subscription_workers should
describe what happens once the transaction represented by last_error_xid
has either been successfully processed or skipped. Does this "last error"
stick around until another error happens (which is hopefully very rare) or
does it reset to blanks?

It will be reset only on subscription drop, otherwise, it will stick
around until another error happens.
I really dislike the user experience this provides, and given it is new in
v15 (and right now this table seems to exist solely to support this
feature) changing this seems within the realm of possibility. I have to
imagine these workers have a sense of local state that would just be "no
errors, no need to touch pg_stat_subscription_workers at the end of this
transaction's commit". It would save a local state of the error_xid and if
a successfully committed transaction has that xid it would clear the
error. The skip code path would also check for and see the matching xid
value and clear the error. Even if the local state thing doesn't work, one
catalog lookup per transaction seems like potentially reasonable overhead
to incur here.
David J.
On Sat, Jan 22, 2022 at 9:21 AM David G. Johnston <
david.g.johnston@gmail.com> wrote:
On Sat, Jan 22, 2022 at 2:41 AM Amit Kapila <amit.kapila16@gmail.com>
wrote:

Additionally, the description for pg_stat_subscription_workers should
describe what happens once the transaction represented by last_error_xid
has either been successfully processed or skipped. Does this "last error"
stick around until another error happens (which is hopefully very rare) or
does it reset to blanks?

It will be reset only on subscription drop, otherwise, it will stick
around until another error happens.

I really dislike the user experience this provides, and given it is new in
v15 (and right now this table seems to exist solely to support this
feature) changing this seems within the realm of possibility. I have to
imagine these workers have a sense of local state that would just be "no
errors, no need to touch pg_stat_subscription_workers at the end of this
transaction's commit". It would save a local state of the error_xid and if
a successfully committed transaction has that xid it would clear the
error. The skip code path would also check for and see the matching xid
value and clear the error. Even if the local state thing doesn't work, one
catalog lookup per transaction seems like potentially reasonable overhead
to incur here.
It shouldn't even need to be that overhead intensive. Once an error is
encountered the system stops. By construction it must be told to redo, at
which point the information about "last error" is no longer relevant and
can be removed (for skipping the user/system will have already done
everything with the xid that is needed before the redo is issued). In the
steady-state it then is simply empty until a new error arises at which
point it becomes populated again; and stays that way until the system goes
into redo mode as instructed by the user via one of several methods.
David J.
On Fri, Jan 21, 2022 at 9:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 21, 2022 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 21, 2022 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Few things that I think we can improve in 028_skip_xact.pl are as follows:
After CREATE SUBSCRIPTION, wait for initial sync to be over and
two_phase state to be enabled. Please see 021_twophase. For the
streaming case, we might be able to ensure streaming even with lesser
data. Can you please try that?

I noticed that the time taken by the test newly added by this patch is on the
upper side. See comparison with the subscription test that takes max
time:
[17:38:49] t/028_skip_xact.pl ................. ok 9298 ms
[17:38:59] t/100_bugs.pl ...................... ok 11349 ms

I think we can reduce the time by removing some stream tests without much
impact on coverage, possibly related to 2PC and streaming together,
and if you do that we probably don't need a subscription with both 2PC
and streaming enabled.
Agreed.
In addition to that, after some tests, I realized that the two tests
of ROLLBACK PREPARED are not stable. If the walsender detects a
concurrent abort of the transaction that is being decoded, it’s
possible that it sends only begin_prepare and prepare messages. If
this happens before setting skip_xid, a unique key
constraint violation doesn't occur on the subscriber, and
consequently, skip_xid is not cleared. We can reduce the possibility
by setting a very high value to wal_retrieve_retry_interval but I
think it’s better to remove them. What do you think?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Jan 21, 2022 at 8:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 21, 2022 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jan 21, 2022 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
What do we want to indicate by [, ... ]? To me, it appears like
multiple options but that is not what we support currently.

You're right. It's an oversight.
I have fixed this and a few other things in the attached patch.
Thank you for updating the patch!
1.
The newly added column needs to be updated in the following statement:
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
substream, subtwophasestate, subslotname, subsynccommit,
subpublications)
ON pg_subscription TO public;2. +stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn, + TimestampTz origin_timestamp) +{ + Assert(is_skipping_changes()); + + ereport(LOG, + (errmsg("done skipping logical replication transaction %u", + skip_xid)));Isn't it better to move this LOG at the end of this function? Because
clear* functions can give an error, so it is better to move it after
that. I have done that in the attached.3. +-- fail - must be superuser +SET SESSION AUTHORIZATION 'regress_subscription_user2'; +ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100); +ERROR: must be owner of subscription regress_testsubThis test doesn't seem to be right. You want to get the error for the
superuser but the error is for the owner. I have changed this test to
do what it intends to do.Apart from this, I have changed a few comments and ran pgindent. Do
let me know what you think of the changes?
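Concretely, the fix for point 1 would presumably add the new column to the GRANT list, something like the following sketch (assuming the column is named subskipxid, as elsewhere in this thread):

```sql
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
              substream, subtwophasestate, subslotname, subsynccommit,
              subskipxid, subpublications)
    ON pg_subscription TO public;
```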
Agree with these changes.
Few things that I think we can improve in 028_skip_xact.pl are as follows:
After CREATE SUBSCRIPTION, wait for initial sync to be over and
two_phase state to be enabled. Please see 021_twophase.
Agreed.
For the
streaming case, we might be able to ensure streaming even with lesser
data. Can you please try that?
Yeah, after some tests, it's enough to insert 500 rows as follows:
INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM
generate_series(1, 500) s(i);
I've just sent another email about that probably we can remove two
tests for ROLLBACK PREPARED, so I’ll update the patch while including
this point.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Jan 24, 2022 at 8:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jan 21, 2022 at 9:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 21, 2022 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 21, 2022 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Few things that I think we can improve in 028_skip_xact.pl are as follows:
After CREATE SUBSCRIPTION, wait for initial sync to be over and
two_phase state to be enabled. Please see 021_twophase. For the
streaming case, we might be able to ensure streaming even with lesser
data. Can you please try that?

I noticed that the time taken by the test newly added by this patch is on the
upper side. See comparison with the subscription test that takes max
time:
[17:38:49] t/028_skip_xact.pl ................. ok 9298 ms
[17:38:59] t/100_bugs.pl ...................... ok 11349 ms

I think we can reduce the time by removing some stream tests without much
impact on coverage, possibly related to 2PC and streaming together,
and if you do that we probably don't need a subscription with both 2PC
and streaming enabled.

Agreed.
In addition to that, after some tests, I realized that the two tests
of ROLLBACK PREPARED are not stable. If the walsender detects a
concurrent abort of the transaction that is being decoded, it’s
possible that it sends only begin_prepare and prepare messages. If
this happens before setting skip_xid, a unique key
constraint violation doesn't occur on the subscriber, and
consequently, skip_xid is not cleared. We can reduce the possibility
by setting a very high value to wal_retrieve_retry_interval but I
think it’s better to remove them.
+1.
--
With Regards,
Amit Kapila.
On Sat, Jan 22, 2022 at 9:51 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
So long as the ALTER command errors when asked to skip those IDs there isn't any reason for an end-user, who likely doesn't know or care that 1 and 2 are special, to be concerned about them (the only two invalid values) while reading the docs.
In this matter, I don't see any problem with the current text proposed
and there are many others who have also reviewed it. I am fine to
change if others also think that the current text needs to be changed.
Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped. Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks?
It will be reset only on subscription drop, otherwise, it will stick
around until another error happens.

I really dislike the user experience this provides, and given it is new in v15 (and right now this table seems to exist solely to support this feature) changing this seems within the realm of possibility. I have to imagine these workers have a sense of local state that would just be "no errors, no need to touch pg_stat_subscription_workers at the end of this transaction's commit". It would save a local state of the error_xid and if a successfully committed transaction has that xid it would clear the error. The skip code path would also check for and see the matching xid value and clear the error. Even if the local state thing doesn't work, one catalog lookup per transaction seems like potentially reasonable overhead to incur here.
Are you suggesting that we update the catalog to save error_xid when an error
occurs? If so, that has many challenges like we are not supposed to
perform any such operations when the transaction is in an error state.
We have discussed this and other ideas in the beginning. I don't find
any of your arguments convincing to change the basic approach here but
I would like to see what others think on this matter?
--
With Regards,
Amit Kapila.
On Sun, Jan 23, 2022 at 8:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I really dislike the user experience this provides, and given it is new
in v15 (and right now this table seems to exist solely to support this
feature) changing this seems within the realm of possibility. I have to
imagine these workers have a sense of local state that would just be "no
errors, no need to touch pg_stat_subscription_workers at the end of this
transaction's commit". It would save a local state of the error_xid and if
a successfully committed transaction has that xid it would clear the
error. The skip code path would also check for and see the matching xid
value and clear the error. Even if the local state thing doesn't work, one
catalog lookup per transaction seems like potentially reasonable overhead
to incur here.

Are you suggesting that we update the catalog to save error_xid when an error
occurs? If so, that has many challenges like we are not supposed to
perform any such operations when the transaction is in an error state.
We have discussed this and other ideas in the beginning. I don't find
any of your arguments convincing to change the basic approach here but
I would like to see what others think on this matter?
Then how does the table get updated to that state in the first place since
it doesn't know the error details until there is an error?
In any case, clearing out the entries in the table would not happen while
it is applying the replication stream, in an error state or otherwise.
in = while streaming
out = not streaming
1(in). replication stream is working
2(in). replication stream fails; capture error information
3(in->out). stop replication stream; perform rollback on xid
4(out). update pg_stat_subscription_worker to report the failure, including
xid of the transaction
5(out). wait for the user to manually restart the replication stream
[if they do so by skipping the xid, save the xid from
pg_stat_subscription_worker into pg_subscription.subskipxid - possibly
requiring the user to confirm the xid]
[user has now done their thing and requested that the replication stream
resume]
6(out). clear the error information from pg_stat_subscription_worker; it is
no longer useful/doesn't exist because the user just took action to avoid
that very error, one way (skipping its transaction) or another.
7(out->in). resume the replication stream, return to step 1
You are already doing steps 1-5 and 7 today however you are forced to deal
with transactions and catalog access. I am just adding step 6, which turns
last_error_xid into current_error_xid because it is current value of the
error in the stream during step 5 when the user needs to decide how to
recover from the error. Once the user decides and the stream resumes that
error information has no value (go look in the logs if you want history).
Thus when 7 comes around and the stream is restarted the error info in
pg_stat_subscription_worker is empty waiting for the next error to happen.
If the user did nothing in step 5 then when that same wal is replayed at
step 2 the error will come back.
The main thing is how many ways can the user exit step 5 and to make sure
that no matter which way they exit step 6 happens before step 7.
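In SQL terms, the recovery flow sketched above might look like this (last_error_xid is from the view as discussed; subname, last_error_message, and the subscription name are assumed for illustration):

```sql
-- step 5: inspect the active error reported by the apply worker
SELECT subname, last_error_xid, last_error_message
FROM pg_stat_subscription_workers;

-- then skip that transaction; in the proposed flow the error info
-- would be cleared (step 6) before the stream resumes (step 7)
ALTER SUBSCRIPTION test_sub SKIP (xid = 590);
```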
David J.
On Fri, Jan 21, 2022 7:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
2.
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+                      TimestampTz origin_timestamp)
+{
+    Assert(is_skipping_changes());
+
+    ereport(LOG,
+            (errmsg("done skipping logical replication transaction %u",
+                    skip_xid)));

Isn't it better to move this LOG to the end of this function? Because
the clear* functions can give an error, so it is better to move it after
that. I have done that in the attached.
+ /* Stop skipping changes */
+ skip_xid = InvalidTransactionId;
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction %u",
+ skip_xid)));
I think we can move the LOG before resetting skip_xid, otherwise skip_xid would
always be 0 in the LOG.
Regards,
Tang
On Mon, Jan 24, 2022 at 1:49 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Sun, Jan 23, 2022 at 8:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I really dislike the user experience this provides, and given it is new in v15 (and right now this table seems to exist solely to support this feature) changing this seems within the realm of possibility. I have to imagine these workers have a sense of local state that would just be "no errors, no need to touch pg_stat_subscription_workers at the end of this transaction's commit". It would save a local state of the error_xid and if a successfully committed transaction has that xid it would clear the error. The skip code path would also check for and see the matching xid value and clear the error. Even if the local state thing doesn't work, one catalog lookup per transaction seems like potentially reasonable overhead to incur here.
Are you suggesting that we update the catalog to save error_xid when an error
occurs? If so, that has many challenges like we are not supposed to
perform any such operations when the transaction is in an error state.
We have discussed this and other ideas in the beginning. I don't find
any of your arguments convincing to change the basic approach here but
I would like to see what others think on this matter?

Then how does the table get updated to that state in the first place since it doesn't know the error details until there is an error?
I think your idea is based on storing error information, including the
XID, in the system catalog. I think that the reasons why we use
the stats collector to store error information including
last_error_xid are (1) as Amit mentioned, it would have many
challenges if updating the catalog when the transaction is in an error
state, and (2) we can store more information such as error messages,
action, etc. other than XID so that users can identify that the
reported error is a conflict error but not other types of error such
as OOM error. For these reasons to me, it makes sense to store
subscribers' error information by using the stats collector.
When it comes to reporting a message to the stats collector, we need
to note that it's not guaranteed that all messages arrive at the stats
collector. Therefore, last_error_xid doesn't necessarily get
updated after the worker reports an error. Similarly, the same is true
for clearing subskipxid. I agree that it's useful if
pg_subscription.subskipxid is automatically set when executing ALTER
SUBSCRIPTION SKIP but it might not work in some cases because of this
restriction.
There is another idea of storing error XID on shmem (e.g., in
ReplicationState) in addition to reporting error details to the stats
collector and using the XID when skipping the transaction, but I'm not
sure whether it's a reliable way.
Anyway, even if subskipxid is automatically set when ALTER
SUBSCRIPTION SKIP, I think we need to provide a way to clear it as the
current patch does (setting NONE) just in case.
In any case, clearing out the entries in the table would not happen while it is applying the replication stream, in an error state or otherwise.
in = while streaming
out = not streaming

1(in). replication stream is working
2(in). replication stream fails; capture error information
3(in->out). stop replication stream; perform rollback on xid
4(out). update pg_stat_subscription_worker to report the failure, including xid of the transaction
5(out). wait for the user to manually restart the replication stream
Do you mean that there is always user intervention after an error so the
replication stream can resume?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Sun, Jan 23, 2022 at 11:55 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
On Mon, Jan 24, 2022 at 1:49 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Sun, Jan 23, 2022 at 8:35 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
I really dislike the user experience this provides, and given it is
new in v15 (and right now this table seems to exist solely to support this
feature) changing this seems within the realm of possibility. I have to
imagine these workers have a sense of local state that would just be "no
errors, no need to touch pg_stat_subscription_workers at the end of this
transaction's commit". It would save a local state of the error_xid and if
a successfully committed transaction has that xid it would clear the
error. The skip code path would also check for and see the matching xid
value and clear the error. Even if the local state thing doesn't work, one
catalog lookup per transaction seems like potentially reasonable overhead
to incur here.

Are you suggesting that we update the catalog to save error_xid when an error
occurs? If so, that has many challenges like we are not supposed to
perform any such operations when the transaction is in an error state.
We have discussed this and other ideas in the beginning. I don't find
any of your arguments convincing to change the basic approach here but
I would like to see what others think on this matter?

Then how does the table get updated to that state in the first place
since it doesn't know the error details until there is an error?
I think your idea is based on storing error information, including the
XID, in the system catalog. I think that the reasons why we use
the stats collector
I noticed this dynamic while skimming the patch (and also pondering why the
new worker table was not in a catalog chapter) but am only now fully
beginning to appreciate its impact on this discussion.
to store error information including
last_error_xid are (1) as Amit mentioned, it would have many
challenges if updating the catalog when the transaction is in an error
state, and
I'm going on faith right now that this is a problem. But from my prior
outline I hope you can see why I find it surprising. Don't try to update a
catalog while in an error state. Get out of the error state first. e.g.,
A transient "holding pattern" would seem to work. Upon a server restart
the transient state would be forgotten, it would attempt to reapply the
wal, would see the same error, and would then go back into the transient
holding pattern. I do intend to read the other discussion on this
particular topic so a detailed rebuttal, if warranted, can be withheld.
(2) we can store more information such as error messages,
action, etc. other than XID so that users can identify that the
reported error is a conflict error but not other types of error such
as OOM error.
I mentioned only XID because of the focus on SKIP. The other data already
present in that table is ok. Whether we use a catalog or the stats
collector seems irrelevant. If anything the catalog makes more sense -
calling an error message a statistic is a bit of a reach.
Similarly, the same is true
for clearing subskipxid. I agree that it's useful if
pg_subscription.subskipxid is automatically set when executing ALTER
SUBSCRIPTION SKIP but it might not work in some cases because of this
restriction. For these reasons to me, it makes sense to store
subscribers' error information by using the stats collector.
I'm confused - pg_subscription is a catalog, not a stat view. Why is it
affected?
I don't see how point 2 prevents using a system catalog. I accept point 1
as true but will need to read some of the prior discussion to really
understand it.
When it comes to reporting a message to the stats collector, we need
to note that it's not guaranteed that all messages arrive at the stats
collector. Therefore, last_error_xid doesn't necessarily get
updated after the worker reports an error.
You'll forgive me for not considering this due to its apparent lack of
mention in the documentation [*] and its arguable classification as a POLA
violation.
[*]
https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-SUBSCRIPTION
What I do read there seems compatible with the desired user experience.
500ms lag, idle transaction oriented, reset upon unclean shutdown, and
consumers seeing a stable transactional view: none of these seem like
show-stoppers.
Anyway, even if subskipxid is automatically set when ALTER
SUBSCRIPTION SKIP, I think we need to provide a way to clear it as the
current patch does (setting NONE) just in case.
With my suggestion of requiring a matching xid the whole option for
skip_xid = { xid | NONE } remains.
5(out). wait for the user to manually restart the replication stream
Do you mean that there always is user intervention after error so the
replication stream can resume?
That is my working assumption. It doesn't seem like the system would
auto-resume without a DBA doing something (I'll attribute a server crash to
the DBA for convenience).
Apparently I need to read more about how the system works today to
understand how this varies from and integrates with today's user experience.
That said, at present my two dislikes:
1) ALTER SUBSCRIPTION SKIP accepts any xid value (I need to consider further the
timing of when this resets to zero)
2) pg_stat_subscription_worker.last_error_* fields remain populated even
while the system is in a normal operating state.
are preventing me from preferring this patch over the status quo (yes, I
know the 2nd point is about a committed feature). Regardless of how far
off I may be regarding our technical ability to change them to a more (IMO)
user-friendly design.
David J.
On Mon, Jan 24, 2022 at 1:30 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
That said, at present my two dislikes:
1) ALTER SUBSCRIPTION SKIP accepts any xid value (I need to consider further the timing of when this resets to zero)
I think this is required for future extension of this feature wherein
I think there could be multiple such xids say when we support parallel
apply workers. I think we can get to a good way to do it even after the
first version, for example by making the xid an optional parameter.
--
With Regards,
Amit Kapila.
On Mon, Jan 24, 2022 at 5:00 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Sun, Jan 23, 2022 at 11:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Similarly, the same is true
for clearing subskipxid.

I'm confused - pg_subscription is a catalog, not a stat view. Why is it affected?
Sorry, I mistook last_error_xid for subskipxid here.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On 22.01.22 03:54, Amit Kapila wrote:
Won't we already do that for Alter Subscription command which means
nothing special needs to be done for this? However, it seems to me
that the idea we are trying to follow here is that as this option can
lead to data inconsistency, it is good to allow only superusers to
specify this option. The owner of the subscription can be changed to
non-superuser as well in which case I think it won't be a good idea to
allow this option. OTOH, if we think it is okay to allow such an
option to users that don't have superuser privilege then I think
allowing it to the owner of the subscription makes sense to me.
I don't think this functionality allows a nonprivileged user to do
anything they couldn't otherwise do. You can create inconsistent data
in the sense that you can choose not to apply certain replicated data.
But a subscription owner has to have write access to the target tables
of the subscription, so they already have the ability to write or not
write any data they want.
On 22.01.22 10:41, Amit Kapila wrote:
Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped. Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks?
It will be reset only on subscription drop, otherwise, it will stick
around until another error happens.
Is this going to be a problem with transaction ID wraparound? Do we
need to use 64-bit xids for this?
On Monday, January 24, 2022, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jan 24, 2022 at 1:30 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

That said, at present my two dislikes:
1) ALTER SUBSCRIPTION SKIP accepts any xid value (I need to consider further
the timing of when this resets to zero)
I think this is required for future extension of this feature wherein
I think there could be multiple such xids say when we support parallel
apply workers. I think if we get a good way to do it even after the
first version like by making a xid an optional parameter.
Extending the behavior is doable, and maybe we end up without this
limitation in the future, so be it. But I’m having a hard time imagining a
scenario where the xid is not already known to the system, and the user,
and wants to be in effect for a very short window.
David J.
On Mon, Jan 24, 2022 at 7:36 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 22.01.22 03:54, Amit Kapila wrote:
Won't we already do that for Alter Subscription command which means
nothing special needs to be done for this? However, it seems to me
that the idea we are trying to follow here is that as this option can
lead to data inconsistency, it is good to allow only superusers to
specify this option. The owner of the subscription can be changed to
non-superuser as well in which case I think it won't be a good idea to
allow this option. OTOH, if we think it is okay to allow such an
option to users that don't have superuser privilege then I think
allowing it to the owner of the subscription makes sense to me.I don't think this functionality allows a nonprivileged user to do
anything they couldn't otherwise do. You can create inconsistent data
in the sense that you can choose not to apply certain replicated data.
I thought this would be the primary way to skip applying certain
transactions. The other could be via pg_replication_origin_advance().
Or are you talking about the case where we skip applying update/delete
where the corresponding rows are not found?
I see the point that if we can allow the owner to skip applying
updates/deletes in certain cases then probably this should also be
okay. Kindly let us know if you have something else in mind as well?
--
With Regards,
Amit Kapila.
On Mon, Jan 24, 2022 at 7:40 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 22.01.22 10:41, Amit Kapila wrote:
Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped. Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks?
It will be reset only on subscription drop, otherwise, it will stick
around until another error happens.

Is this going to be a problem with transaction ID wraparound?
I think to avoid this we can send a message to clear this (at least to
clear XID in the view) after skipping the xact but there is no
guarantee that it will be received by the stats collector.
Additionally, the worker can periodically (say after every N (100,
500, etc) successful transaction) send a clear message after
successful apply. This will ensure that eventually the error entry
will be cleared.
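As a rough sketch of this periodic-clear idea (hypothetical names, not the actual patch): the apply worker counts successfully applied transactions and, every CLEAR_INTERVAL of them, sends one more "clear error" message to the unreliable collector, so a lost message is eventually retried.

```c
#include <stdbool.h>
#include <stdint.h>

#define CLEAR_INTERVAL 100		/* e.g. 100, 500, ... successful xacts */

typedef struct ApplyErrorClearState
{
	uint32_t	successful_xacts;	/* counted since the last clear message */
} ApplyErrorClearState;

/*
 * Called after each successfully applied transaction; returns true when
 * the worker should (re)send a message asking the stats collector to
 * clear the stored error entry. Because the collector is UDP-like and
 * may drop messages, the message is simply resent every CLEAR_INTERVAL
 * transactions until the entry is gone.
 */
bool
maybe_send_clear_message(ApplyErrorClearState *state)
{
	state->successful_xacts++;
	if (state->successful_xacts >= CLEAR_INTERVAL)
	{
		state->successful_xacts = 0;
		return true;			/* caller sends the pgstat clear message */
	}
	return false;
}
```

This only guarantees eventual clearing, which is the trade-off being discussed above.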
Do we
need to use 64-bit xids for this?
For 64-bit XIds, as this reported XID is for the remote transactions,
I think we need to add 4-bytes to each transaction message(say Begin)
and that could be costly for small transactions. We also probably need
to make logical decoding aware of 64-bit XID? Note that XIDs in WAL
records are still 32-bit XID. I don't think this feature deserves such
a big (in terms of WAL and network message size) change.
--
With Regards,
Amit Kapila.
On 25.01.22 03:54, Amit Kapila wrote:
I don't think this functionality allows a nonprivileged user to do
anything they couldn't otherwise do. You can create inconsistent data
in the sense that you can choose not to apply certain replicated data.

I thought this will be the only primary way to skip applying certain
transactions. The other could be via pg_replication_origin_advance().
Or are you talking about the case where we skip applying update/delete
where the corresponding rows are not found?

I see the point that if we can allow the owner to skip applying
updates/deletes in certain cases then probably this should also be
okay. Kindly let us know if you have something else in mind as well?
Let's start this again: The question at hand is whether ALTER
SUBSCRIPTION ... SKIP should be allowed for subscription owners that are
not superusers. The argument raised against that was that this would
allow the owner to create "inconsistent" data. But it hasn't been
explained what that actually means or why it is dangerous.
On 25.01.22 06:18, Amit Kapila wrote:
I think to avoid this we can send a message to clear this (at least to
clear XID in the view) after skipping the xact but there is no
guarantee that it will be received by the stats collector.
Additionally, the worker can periodically (say after every N (100,
500, etc) successful transaction) send a clear message after
successful apply. This will ensure that eventually the error entry
will be cleared.
Well, I think we need *some* solution for now. We can't leave a footgun
where you say, "skip transaction 700", somehow transaction 700 doesn't
happen, the whole thing gets forgotten, but then 3 months later, the
next transaction 700 mysteriously gets dropped.
On Tue, Jan 25, 2022 at 5:52 AM Peter Eisentraut <
peter.eisentraut@enterprisedb.com> wrote:
On 25.01.22 06:18, Amit Kapila wrote:
I think to avoid this we can send a message to clear this (at least to
clear XID in the view) after skipping the xact but there is no
guarantee that it will be received by the stats collector.
Additionally, the worker can periodically (say after every N (100,
500, etc) successful transaction) send a clear message after
successful apply. This will ensure that eventually the error entry
will be cleared.

Well, I think we need *some* solution for now. We can't leave a footgun
where you say, "skip transaction 700", somehow transaction 700 doesn't
happen, the whole thing gets forgotten, but then 3 months later, the
next transaction 700 mysteriously gets dropped.
This is indeed part of why I feel that the xid being skipped should be
validated. As the feature is presented the user is supposed to read the
xid from the system (the new stat view or the error log) and supply it and
then the worker, when it goes to skip, should find that the very first
transaction xid it encounters is the one it is being told to skip. It
skips that transaction, clears the skipxid, and puts the system back into
normal operating mode. If that first transaction xid isn't the one being
specified to skip the worker should error with "skipping transaction
failed, xid 123 expected but 456 found".
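To sketch the validation being proposed (hypothetical names and types, not the actual patch): when a skip xid is set, the very first transaction the worker sees either matches and is skipped, or is reported as a mismatch. Whether a mismatch should WARN and continue or ERROR and stop is debated later in the thread; this sketch clears the setting either way, following the warn-and-continue variant.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

typedef enum SkipDecision
{
	SKIP_NOT_SET,				/* no skip requested: apply normally */
	SKIP_THIS_XACT,				/* matched: skip this xact, skip xid cleared */
	SKIP_MISMATCH				/* first xact differs: warn/error, cleared */
} SkipDecision;

/*
 * Check the first transaction received against the requested skip xid.
 * The skip request is consumed by the first transaction regardless of
 * outcome, so a stale xid cannot silently skip an unrelated transaction
 * much later (e.g. after xid wraparound).
 */
SkipDecision
check_skip_xid(TransactionId *skip_xid, TransactionId remote_xid)
{
	if (*skip_xid == InvalidTransactionId)
		return SKIP_NOT_SET;

	if (*skip_xid == remote_xid)
	{
		*skip_xid = InvalidTransactionId;	/* back to normal operation */
		return SKIP_THIS_XACT;
	}

	/* "skipping transaction failed, xid %u expected but %u found" */
	*skip_xid = InvalidTransactionId;
	return SKIP_MISMATCH;
}
```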
This whole lack of a guarantee of the availability and accuracy regarding
the data that this process should be reliant upon needs to be engineered
away.
David J.
On Tue, Jan 25, 2022 at 11:35 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Tue, Jan 25, 2022 at 5:52 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
On 25.01.22 06:18, Amit Kapila wrote:
I think to avoid this we can send a message to clear this (at least to
clear XID in the view) after skipping the xact but there is no
guarantee that it will be received by the stats collector.
Additionally, the worker can periodically (say after every N (100,
500, etc) successful transaction) send a clear message after
successful apply. This will ensure that eventually the error entry
will be cleared.

Well, I think we need *some* solution for now. We can't leave a footgun
where you say, "skip transaction 700", somehow transaction 700 doesn't
happen, the whole thing gets forgotten, but then 3 months later, the
next transaction 700 mysteriously gets dropped.

This is indeed part of why I feel that the xid being skipped should be validated. As the feature is presented the user is supposed to read the xid from the system (the new stat view or the error log) and supply it and then the worker, when it goes to skip, should find that the very first transaction xid it encounters is the one it is being told to skip. It skips that transaction, clears the skipxid, and puts the system back into normal operating mode. If that first transaction xid isn't the one being specified to skip the worker should error with "skipping transaction failed, xid 123 expected but 456 found".
Yeah, I think it's a good idea to clear the subskipxid after the first
transaction regardless of whether the worker skipped it.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
Yeah, I think it's a good idea to clear the subskipxid after the first
transaction regardless of whether the worker skipped it.
So basically instead of stopping the worker with an error you suggest
having the worker continue applying changes (after resetting subskipxid,
and - arguably - the ?_error_* fields). Log the transaction xid mismatch
as a warning in the log file as opposed to an error.
I was proposing to make it an error and have the worker stop again since in
a system where the xid is verified and the code is bug-free I would expect
the situation to be a "can't happen" one and I'd rather error in that
circumstance than warn. The DBA will have to go and ALTER SUBSCRIPTION
SKIP (xid = NONE) to get the worker working again but I find that
acceptable in this case.
David J.
On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Yeah, I think it's a good idea to clear the subskipxid after the first
transaction regardless of whether the worker skipped it.

So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields). Log the transaction xid mismatch as a warning in the log file as opposed to an error.
Agreed, I think it's better to log a warning than to raise an error.
In the case where the user specified the wrong XID, the worker should
fail again due to the same error.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jan 25, 2022 at 8:09 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
Yeah, I think it's a good idea to clear the subskipxid after the first
transaction regardless of whether the worker skipped it.

So basically instead of stopping
having the worker continue applying changes (after resetting subskipxid,
and - arguably - the ?_error_* fields). Log the transaction xid mismatch
as a warning in the log file as opposed to an error.

Agreed, I think it's better to log a warning than to raise an error.
In the case where the user specified the wrong XID, the worker should
fail again due to the same error.
If it remains possible for the system to accept a wrongly specified XID I
would agree that this behavior is preferable. At least when the user
wonders why the skip didn't work and they are seeing the same error again
they will have a log entry warning telling them their XID choice was
incorrect. I would prefer that the system not accept a wrongly specified
XID and the user be told directly and sooner that their XID choice was
incorrect.
David J.
On Wed, Jan 26, 2022 at 12:14 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Tue, Jan 25, 2022 at 8:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Yeah, I think it's a good idea to clear the subskipxid after the first
transaction regardless of whether the worker skipped it.

So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields). Log the transaction xid mismatch as a warning in the log file as opposed to an error.
Agreed, I think it's better to log a warning than to raise an error.
In the case where the user specified the wrong XID, the worker should
fail again due to the same error.

If it remains possible for the system to accept a wrongly specified XID I would agree that this behavior is preferable. At least when the user wonders why the skip didn't work and they are seeing the same error again they will have a log entry warning telling them their XID choice was incorrect.
Yes.
I would prefer that the system not accept a wrongly specified XID and the user be told directly and sooner that their XID choice was incorrect.
Given that we cannot rely on the pg_stat_subscription_workers view
for this purpose, we would need either a new sub-system that tracks
each logical replication status so the system can set the error XID to
subskipxid, or to wait for the shared-memory based stats collector. While
agreeing that ideally we need such a sub-system, I'm not sure that
everyone will agree to add that complexity for this feature. That having
been said, if there is a significant need for it, we can implement it
as an improvement.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Tue, Jan 25, 2022 at 8:33 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:
Given that we cannot rely on the pg_stat_subscription_workers view
for this purpose, we would need either a new sub-system that tracks
each logical replication status so the system can set the error XID to
subskipxid, or to wait for shared-memory based stats collector.
I'm reading over the monitoring-stats page to try and get my head around
all of this. First of all, it defines two kinds of views:
1. PostgreSQL's statistics collector is a subsystem that supports
collection and reporting of information about server activity.
2. PostgreSQL also supports reporting dynamic information ... This facility
is independent of the collector process.
It then has two tables:
28.1 Dynamic Statistics Views (describing #2 above)
28.2 Collected Statistics Views (describing #1 above)
Apparently the "collector process" is UDP-like, not reliable. The
documentation fails to mention this fact. I'd argue that this is a
documentation bug.
I do see that the pg_stat_subscription_workers view is correctly placed in
Table 28.2
Reviewing the other views listed in that table only pg_stat_archiver abuses
the statistics collector in a similar fashion. All of the others are
actually metric oriented.
I don't care for the specification: "will contain one row per subscription
worker on which errors have occurred, for workers applying logical
replication changes and workers handling the initial data copy of the
subscribed tables."
I would much rather have this behave similar to pg_stat_activity (which, of
course, is a Dynamic Statistics View...) in that it shows only and all
workers that are presently working. The tablesync workers should go away
when they have finished synchronizing. I should not have to manually
intervene to get rid of unreliable expired data. The log file feels like a
superior solution to this monitoring view.
Alternatively, if the tablesync workers are done but we've been
accumulating real statistics for them, then by all means keep them included
in the view - but regardless of whether they encountered an error. But
maybe the view can right join to pg_stat_subscription and show a column for
"(pid is not null) AS is_active".
Maybe we need to add a track_finished_tablesync_workers GUC so the DBA can
decide whether to devote storage and processing resources to that
historical information.
If you had kept the original view name, "pg_stat_subscription_error", this
whole issue goes away. But you decided to make it more generic and call it
"pg_stat_subscription_workers" - which means you need to get rid of the
error-specific condition in the WHERE clause for the view. Show all
workers - I can filter on is_active. Showing only active workers is also
acceptable. You won't get to change your mind so decide whether this wants
to show only current and running state or whether historical statistics for
now defunct tablesync workers are desired. Personally, I would just show
active workers and if someone wants to add the feature they can add a
track_tablesync_worker_stats GUC and a matching view.
From that, every apply worker should be sending a statistics message to the
collector periodically. If error info is not present and the state is "all
is well", clear out any existing error info from the view. The attempt to
include an actual statistic field here doesn't seem useful nor redeeming.
I would add a "state" field in its place (well, after subrelid). And I
would still rename the columns to current_error_* and note that these
should be null unless the status field shows error (there may be some
additional complexity here). Just get rid of last_error_count.
David J.
P.S. I saw the discussion regarding pg_dump'ing the subskipxid field. I
didn't notice any discussion around creating and restoring a basebackup.
It seems like during server startup subskipxid should just be cleared out.
Then it doesn't matter what one does during backup.
On Tue, Jan 25, 2022 at 6:18 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 25.01.22 03:54, Amit Kapila wrote:
I don't think this functionality allows a nonprivileged user to do
anything they couldn't otherwise do. You can create inconsistent data
in the sense that you can choose not to apply certain replicated data.

I thought this will be the only primary way to skip applying certain
transactions. The other could be via pg_replication_origin_advance().
Or are you talking about the case where we skip applying update/delete
where the corresponding rows are not found?

I see the point that if we can allow the owner to skip applying
updates/deletes in certain cases then probably this should also be
okay. Kindly let us know if you have something else in mind as well?

Let's start this again: The question at hand is whether ALTER
SUBSCRIPTION ... SKIP should be allowed for subscription owners that are
not superusers. The argument raised against that was that this would
allow the owner to create "inconsistent" data. But it hasn't been
explained what that actually means or why it is dangerous.
There are two reasons in my mind: (a) We are going to skip some
unrelated data changes that are not the direct cause of conflict
because of the entire transaction skip. Now, it is possible that
unintentionally it allows skipping some actual changes
insert/update/delete/truncate to some relations which will then allow
even the future changes to cause some conflict or won't get applied. A
few examples are after TRUNCATE is skipped, the INSERTS in following
transactions can cause error "duplicate key .."; similarly say some
INSERT is skipped, then following UPDATE/DELETE won't find the
corresponding row to perform the operation. (b) Users can specify some
random XID, the discussion below is trying to detect this and raise
WARNING/ERROR but still, it could cause some valid transaction (which
won't generate any conflict/error) to skip.
These can lead to some missing data in the subscriber which the user
might not have expected.
--
With Regards,
Amit Kapila.
On Mon, Jan 24, 2022 at 12:59 AM David G. Johnston <
david.g.johnston@gmail.com> wrote:
5(out). wait for the user to manually restart the replication stream
Do you mean that there always is user intervention after error so the
replication stream can resume?

That is my working assumption. It doesn't seem like the system would
auto-resume without a DBA doing something (I'll attribute a server crash to
the DBA for convenience).

Apparently I need to read more about how the system works today to
understand how this varies from and integrates with today's user experience.
I've done some code reading. My understanding is that a background worker
for the main apply of a given subscription is created from the launcher
code (not reviewed) which is initialized at server startup (or as needed
sometime thereafter). This goes into a for(;;) loop in LogicalRepApplyLoop
under a PG_TRY in ApplyWorkerMain. When a message is applied that provokes
an error the PG_CATCH() in ApplyWorkerMain takes over and then this worker
dies. While in that PG_CATCH() we have an aborted transaction and so are
limited in what we can change. We PG_RE_THROW(); back to the background
worker infrastructure and let it perform logging and cleanup; which
includes destroying this instance of the background worker. The
background worker that is destroyed is replaced and its replacement is
identical to the original so far as the statistics collector is concerned.
I haven't traced out when the replacement apply worker gets recreated. It
seems like doing so immediately, and then it going and just encountering
the same error, would be an undesirable choice, and so I've assumed it does
not. But I also wasn't expecting the apply worker to PG_RE_THROW() either,
but instead continue on running in a different for(;;) loop waiting for
some signal from the system that something has changed that may avoid the
error that put it in timeout.
So my more detailed goal would be to get rid of PG_RE_THROW(); (I assume
doing so would entail transaction rollback) and stay in the worker. Update
pg_subscription with the error information (having removed PG_RE_THROW we
have new things to consider re: pg_stat_subscription_workers). Go into a
for(;;) loop, maybe polling pg_subscription for an indication that it is OK
to retry applying the last transaction. (can an inter-process signal be
sent from a normal backend process to a background worker process?). The
SKIP command then matches XID values on pg_subscription; the resumption
sees the subskipxid, updates pg_subscription to remove the error info and
subskipxid, skips the next transaction assuming it has the matching XID, and
then continues applying as normal. Adapt to deal with crash conditions as
needed though clearing before reapplying seems like a safe default. Again,
upon worker startup maybe they should be cleared too (making pg_dump and
other backup considerations moot - as noted in my P.S. in the previous
email).
I'm not sure we are paranoid enough regarding the locking of
pg_subscription for purposes of reading and writing subskipxid. I'd
probably rather serialize access to it, and maybe even not allow changing
from one non-zero XID to another non-zero XID. It shouldn't be needed in
practice (moreso if the XID has to be the one that is present from
current_error_xid) and the user can always reset first.
In worker.c I was and still am confused as to the meaning of 'c' and 'w' in
LogicalRepApplyLoop. In apply_dispatch in that file enums are used to
compare against the message byte, it would be helpful for the inexperienced
reader if 'c' and 'w' were done as enums instead as well.
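For illustration, naming the message bytes could look like the sketch below, assuming 'w' marks streamed WAL data and 'k' a sender keepalive as in the streaming replication protocol; the enum and its names are invented here, not actual PostgreSQL code, and the 'c' byte in question is deliberately left out since its meaning is exactly what the paragraph above asks about.

```c
/*
 * Hypothetical enum for the protocol message bytes handled in
 * LogicalRepApplyLoop, instead of bare character literals.
 */
typedef enum StreamMessageType
{
	STREAM_MSG_WAL_DATA = 'w',	/* XLogData: carries the logical changes */
	STREAM_MSG_KEEPALIVE = 'k'	/* sender keepalive / reply request */
} StreamMessageType;
```

A switch over such an enum documents itself the way apply_dispatch already does.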
David J.
On Tue, Jan 25, 2022 at 8:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Yeah, I think it's a good idea to clear the subskipxid after the first
transaction regardless of whether the worker skipped it.

So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields). Log the transaction xid mismatch as a warning in the log file as opposed to an error.
Agreed, I think it's better to log a warning than to raise an error.
In the case where the user specified the wrong XID, the worker should
fail again due to the same error.
IIUC, the proposal is to compare the skip_xid with the very
transaction the apply worker received to apply and raise a warning if
it doesn't match with skip_xid and then continue. This seems like a
reasonable idea but can we guarantee that it is always the first
transaction that we want to skip? We seem to guarantee that we won't
get something again once it is written durably/flushed on the
subscriber side. I guess here it can happen that before the errored
transaction, there is some empty xact, or maybe part of the stream
(consider streaming transactions) of some xact, or there could be
other cases as well where the server will send those xacts again.
Now, if the above reasoning is correct then I think your proposal to
clear the skip_xid in the catalog as soon as we have applied the first
transaction successfully seems reasonable to me.
--
With Regards,
Amit Kapila.
On Wed, Jan 26, 2022 at 7:31 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Mon, Jan 24, 2022 at 12:59 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
So my more detailed goal would be to get rid of PG_RE_THROW();
I don't think that will be possible, consider the FATAL/PANIC error
case. Also, there are reasons why we always restart apply worker on
ERROR even without this work. If we want to change that, we might need
to redesign the apply side mechanism which I don't think we should try
to do as part of this patch.
--
With Regards,
Amit Kapila.
On Wed, Jan 26, 2022 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 25, 2022 at 8:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Yeah, I think it's a good idea to clear the subskipxid after the first
transaction regardless of whether the worker skipped it.

So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields). Log the transaction xid mismatch as a warning in the log file as opposed to an error.
Agreed, I think it's better to log a warning than to raise an error.
In the case where the user specified the wrong XID, the worker should
fail again due to the same error.

IIUC, the proposal is to compare the skip_xid with the very
transaction the apply worker received to apply and raise a warning if
it doesn't match with skip_xid and then continue. This seems like a
reasonable idea but can we guarantee that it is always the first
transaction that we want to skip? We seem to guarantee that we won't
get something again once it is written durably/flushed on the
subscriber side. I guess here it can happen that before the errored
transaction, there is some empty xact, or maybe part of the stream
(consider streaming transactions) of some xact, or there could be
other cases as well where the server will send those xacts again.
Good point.
I guess that in the situation the worker entered an error loop, we can
guarantee that the worker fails while applying the first non-empty
transaction since starting logical replication. And the transaction is
what we’d like to skip. If the transaction that can be applied without
an error is resent after a restart, it’s a problem of logical
replication. As you pointed out, it's possible that there are some
empty transactions before the transaction in question since we don't
advance replication origin LSN if the transaction is empty. Also,
probably the same is true for a streamed transaction that is rolled
back or ROLLBACK-PREPARED transactions. So, we can also skip clearing
subskipxid if the transaction is empty? That is, we make sure to clear
it after applying the first non-empty transaction. We would need to
carefully think about this solution otherwise ALTER SUBSCRIPTION SKIP
ends up not working at all in some cases.
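The "clear only after the first non-empty transaction" idea can be sketched as follows (hypothetical names, not the actual patch): re-sent empty transactions, rolled-back streamed transactions, and the like must not consume the skip request, so only a non-empty transaction clears it.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

typedef struct SkipState
{
	TransactionId skip_xid;		/* 0 means "no skip requested" */
} SkipState;

/*
 * Called once per received transaction; returns true if this transaction
 * should be skipped. Empty transactions pass through without touching
 * the skip request; the first *non-empty* transaction consumes it,
 * whether or not its xid matched.
 */
bool
handle_transaction(SkipState *state, TransactionId xid, bool is_empty)
{
	bool		do_skip;

	if (state->skip_xid == 0)
		return false;

	do_skip = (state->skip_xid == xid);

	if (!is_empty)
		state->skip_xid = 0;	/* first non-empty xact consumes the request */

	return do_skip;
}
```

The tricky part, as noted above, is being sure the set of transactions the server may legitimately resend (empty xacts, stream fragments) is exactly the set this check lets through.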
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Jan 26, 2022 at 11:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 26, 2022 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 25, 2022 at 8:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Yeah, I think it's a good idea to clear the subskipxid after the first
transaction regardless of whether the worker skipped it.

So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields). Log the transaction xid mismatch as a warning in the log file as opposed to an error.
Agreed, I think it's better to log a warning than to raise an error.
In the case where the user specified the wrong XID, the worker should
fail again due to the same error.

IIUC, the proposal is to compare
transaction the apply worker received to apply and raise a warning if
it doesn't match with skip_xid and then continue. This seems like a
reasonable idea but can we guarantee that it is always the first
transaction that we want to skip? We seem to guarantee that we won't
get something again once it is written durably/flushed on the
subscriber side. I guess here it can happen that before the errored
transaction, there is some empty xact, or maybe part of the stream
(consider streaming transactions) of some xact, or there could be
other cases as well where the server will send those xacts again.

Good point.
I guess that in the situation the worker entered an error loop, we can
guarantee that the worker fails while applying the first non-empty
transaction since starting logical replication. And the transaction is
what we’d like to skip. If the transaction that can be applied without
an error is resent after a restart, it’s a problem of logical
replication. As you pointed out, it's possible that there are some
empty transactions before the transaction in question since we don't
advance replication origin LSN if the transaction is empty. Also,
probably the same is true for a streamed transaction that is rolled
back or ROLLBACK-PREPARED transactions. So, we can also skip clearing
subskipxid if the transaction is empty? That is, we make sure to clear
it after applying the first non-empty transaction. We would need to
carefully think about this solution otherwise ALTER SUBSCRIPTION SKIP
ends up not working at all in some cases.
Probably, we also need to consider the case where the tablesync worker
entered an error loop and the user wants to skip the transaction? The
apply worker is also running at the same time but it should not clear
subskipxid. Similarly, the tablesync worker should not clear
subskipxid if the apply worker wants to skip the transaction.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Jan 26, 2022 at 8:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 26, 2022 at 11:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 26, 2022 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
IIUC, the proposal is to compare the skip_xid with the very
transaction the apply worker received to apply and raise a warning if
it doesn't match with skip_xid and then continue. This seems like a
reasonable idea but can we guarantee that it is always the first
transaction that we want to skip? We seem to guarantee that we won't
get something again once it is written durably/flushed on the
subscriber side. I guess here it can happen that before the errored
transaction, there is some empty xact, or maybe part of the stream
(consider streaming transactions) of some xact, or there could be
other cases as well where the server will send those xacts again.

Good point.
I guess that in the situation the worker entered an error loop, we can
guarantee that the worker fails while applying the first non-empty
transaction since starting logical replication. And the transaction is
what we’d like to skip. If the transaction that can be applied without
an error is resent after a restart, it’s a problem of logical
replication. As you pointed out, it's possible that there are some
empty transactions before the transaction in question since we don't
advance replication origin LSN if the transaction is empty. Also,
probably the same is true for a streamed transaction that is rolled
back or ROLLBACK-PREPARED transactions. So, we can also skip clearing
subskipxid if the transaction is empty? That is, we make sure to clear
it after applying the first non-empty transaction. We would need to
carefully think about this solution otherwise ALTER SUBSCRIPTION SKIP
ends up not working at all in some cases.
I think it is okay to clear after the first successful application of
any transaction. What I was not sure was about the idea of giving
WARNING/ERROR if the first xact to be applied is not the same as
skip_xid.
Probably, we also need to consider the case where the tablesync worker
entered an error loop and the user wants to skip the transaction? The
apply worker is also running at the same time but it should not clear
subskipxid. Similarly, the tablesync worker should not clear
subskipxid if the apply worker wants to skip the transaction.
I think for tablesync workers, the skip_xid set via this mechanism
won't work as we don't have any remote_xid for them, and neither any
XID is reported in the view for them.
--
With Regards,
Amit Kapila.
On Wed, Jan 26, 2022 at 12:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 26, 2022 at 8:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 26, 2022 at 11:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 26, 2022 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
IIUC, the proposal is to compare the skip_xid with the very
transaction the apply worker received to apply and raise a warning if
it doesn't match with skip_xid and then continue. This seems like a
reasonable idea but can we guarantee that it is always the first
transaction that we want to skip? We seem to guarantee that we won't
get something again once it is written durably/flushed on the
subscriber side. I guess here it can happen that before the errored
transaction, there is some empty xact, or maybe part of the stream
(consider streaming transactions) of some xact, or there could be
other cases as well where the server will send those xacts again.
Good point.
I guess that in the situation where the worker entered an error loop, we can
guarantee that the worker fails while applying the first non-empty
transaction since starting logical replication. And the transaction is
what we’d like to skip. If the transaction that can be applied without
an error is resent after a restart, it’s a problem of logical
replication. As you pointed out, it's possible that there are some
empty transactions before the transaction in question since we don't
advance replication origin LSN if the transaction is empty. Also,
probably the same is true for a streamed transaction that is rolled
back or ROLLBACK-PREPARED transactions. So, we can also skip clearing
subskipxid if the transaction is empty? That is, we make sure to clear
it after applying the first non-empty transaction. We would need to
carefully think about this solution otherwise ALTER SUBSCRIPTION SKIP
ends up not working at all in some cases.
I think it is okay to clear after the first successful application of
any transaction. What I was not sure was about the idea of giving
WARNING/ERROR if the first xact to be applied is not the same as
skip_xid.
Do you prefer not to do anything in this case?
Probably, we also need to consider the case where the tablesync worker
entered an error loop and the user wants to skip the transaction? The
apply worker is also running at the same time but it should not clear
subskipxid. Similarly, the tablesync worker should not clear
subskipxid if the apply worker wants to skip the transaction.
I think for tablesync workers, the skip_xid set via this mechanism
won't work as we don't have any remote_xid for them, and neither any
XID is reported in the view for them.
If the tablesync worker raises an error while applying changes after
finishing the copy, it also reports the error XID.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Jan 26, 2022 at 7:05 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Tue, Jan 25, 2022 at 8:33 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Given that we cannot rely on the pg_stat_subscription_workers view
for this purpose, we would need either a new sub-system that tracks
each logical replication status so the system can set the error XID to
subskipxid, or to wait for the shared-memory-based stats collector.
I'm reading over the monitoring-stats page to try and get my head around all of this. First of all, it defines two kinds of views:
1. PostgreSQL's statistics collector is a subsystem that supports collection and reporting of information about server activity.
2. PostgreSQL also supports reporting dynamic information ... This facility is independent of the collector process.
It then has two tables:
28.1 Dynamic Statistics Views (describing #2 above)
28.2 Collected Statistics Views (describing #1 above)
Apparently the "collector process" is UDP-like, not reliable. The documentation fails to mention this fact. I'd argue that this is a documentation bug.
I do see that the pg_stat_subscription_workers view is correctly placed in Table 28.2
Reviewing the other views listed in that table, only pg_stat_archiver abuses the statistics collector in a similar fashion. All of the others are actually metric-oriented.
I don't care for the specification: "will contain one row per subscription worker on which errors have occurred, for workers applying logical replication changes and workers handling the initial data copy of the subscribed tables."
I would much rather have this behave similar to pg_stat_activity (which, of course, is a Dynamic Statistics View...) in that it shows only and all workers that are presently working.
I have no objection against having a dynamic statistics view showing
the status of each running worker but I think it should be implemented
in a separate view and not be something that replaces the
pg_stat_subscription_workers. I think pg_stat_subscription would be
the right place for it.
The tablesync workers should go away when they have finished synchronizing. I should not have to manually intervene to get rid of unreliable expired data. The log file feels like a superior solution to this monitoring view.
Alternatively, if the tablesync workers are done but we've been accumulating real statistics for them, then by all means keep them included in the view - but regardless of whether they encountered an error. But maybe the view can right join against pg_stat_subscription and show a column for "(pid is not null) AS is_active".
Maybe we need to add a track_finished_tablesync_workers GUC so the DBA can decide whether to devote storage and processing resources to that historical information.
If you had kept the original view name, "pg_stat_subscription_error", this whole issue goes away. But you decided to make it more generic and call it "pg_stat_subscription_workers" - which means you need to get rid of the error-specific condition in the WHERE clause for the view. Show all workers - I can filter on is_active. Showing only active workers is also acceptable. You won't get to change your mind so decide whether this wants to show only current and running state or whether historical statistics for now defunct tablesync workers are desired. Personally, I would just show active workers and if someone wants to add the feature they can add a track_tablesync_worker_stats GUC and a matching view.
We plan to clear/remove tablesync entries that have finished synchronization.
It’s better not to merge dynamic statistics such as pid and is_active
and accumulative statistics into one view. I think we can have both
views: pg_stat_subscription_workers view with some changes based on
the review comments (e.g., removing defunct tablesync entry), and
another view showing dynamic statistics such as the worker status.
From that, every apply worker should be sending a statistics message to the collector periodically. If error info is not present and the state is "all is well", clear out any existing error info from the view. The attempt to include an actual statistic field here doesn't seem useful nor redeeming. I would add a "state" field in its place (well, after subrelid). And I would still rename the columns to current_error_* and note that these should be null unless the status field shows error (there may be some additional complexity here). Just get rid of last_error_count.
I don't think that using the stats collector to show the current
status of each worker is a good idea because of the 500ms lag, the UDP
connection, etc. Even if error info is not present and the state is
good according to the view, it might be out-of-date or simply not
true. If we want to do that, it’s much better to prepare something in
shmem so each worker can store its status (running or error, error
xid, etc.) and have pg_stat_subscription (or another view) show the
information. One thing we need to consider is that the status needs to
remain even after the apply/tablesync worker exits, but we don't know
how many worker statuses we need to allocate in shmem at startup
time.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Jan 26, 2022 at 9:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 26, 2022 at 12:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think it is okay to clear after the first successful application of
any transaction. What I was not sure was about the idea of giving
WARNING/ERROR if the first xact to be applied is not the same as
skip_xid.
Do you prefer not to do anything in this case?
I am fine with clearing the skip_xid after the first successful
application. But note, we shouldn't do catalog access for this, we can
check if it is set in MySubscription.
Probably, we also need to consider the case where the tablesync worker
entered an error loop and the user wants to skip the transaction? The
apply worker is also running at the same time but it should not clear
subskipxid. Similarly, the tablesync worker should not clear
subskipxid if the apply worker wants to skip the transaction.
I think for tablesync workers, the skip_xid set via this mechanism
won't work as we don't have any remote_xid for them, and neither any
XID is reported in the view for them.
If the tablesync worker raises an error while applying changes after
finishing the copy, it also reports the error XID.
Right and agreed with your assessment for the same.
--
With Regards,
Amit Kapila.
On Tue, Jan 25, 2022 at 9:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 26, 2022 at 9:36 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:On Wed, Jan 26, 2022 at 12:54 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
Probably, we also need to consider the case where the tablesync
worker
entered an error loop and the user wants to skip the transaction? The
apply worker is also running at the same time but it should not clear
subskipxid. Similarly, the tablesync worker should not clear
subskipxid if the apply worker wants to skip the transaction.
I think for tablesync workers, the skip_xid set via this mechanism
won't work as we don't have any remote_xid for them, and neither any
XID is reported in the view for them.
If the tablesync worker raises an error while applying changes after
finishing the copy, it also reports the error XID.
Right and agreed with your assessment for the same.
IIUC each tablesync process also performs an apply stage but only applies
the messages related to the single table it is responsible for. Once all
tablesync workers synchronize they are all destroyed and the main apply
worker takes over and applies transactions to all subscribed tables.
We probably should just provide an option for the user to specify
"subrelid". If null, only the main apply worker will skip the given xid,
otherwise only the worker tasked with syncing that particular table will do
so. It might take a sequence of ALTER SUBSCRIPTION SET commands to get a
broken initial table synchronization to load completely but at least there
will not be any surprises as to which tables had transactions skipped and
which did not.
It may even make sense, eventually for the main apply worker to skip on a
subrelid basis. Since the main apply worker isn't applying transactions at
the same time as the tablesync workers the non-null subrelid can also be
interpreted by the main apply worker.
David J.
On Wed, Jan 26, 2022 at 1:43 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Tue, Jan 25, 2022 at 9:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 26, 2022 at 9:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 26, 2022 at 12:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Probably, we also need to consider the case where the tablesync worker
entered an error loop and the user wants to skip the transaction? The
apply worker is also running at the same time but it should not clear
subskipxid. Similarly, the tablesync worker should not clear
subskipxid if the apply worker wants to skip the transaction.
I think for tablesync workers, the skip_xid set via this mechanism
won't work as we don't have any remote_xid for them, and neither any
XID is reported in the view for them.
If the tablesync worker raises an error while applying changes after
finishing the copy, it also reports the error XID.
Right and agreed with your assessment for the same.
IIUC each tablesync process also performs an apply stage but only applies the messages related to the single table it is responsible for. Once all tablesync workers synchronize they are all destroyed and the main apply worker takes over and applies transactions to all subscribed tables.
We probably should just provide an option for the user to specify "subrelid". If null, only the main apply worker will skip the given xid, otherwise only the worker tasked with syncing that particular table will do so. It might take a sequence of ALTER SUBSCRIPTION SET commands to get a broken initial table synchronization to load completely but at least there will not be any surprises as to which tables had transactions skipped and which did not.
That would work but I’m concerned about whether users can specify it
properly. Also, we would need to change the errcontext message
generated by apply_error_callback() so the user can know whether the
error occurred in the apply worker or a tablesync worker.
Or, as another idea, since an error during table synchronization is
not common and could in practice be resolved by truncating the table
and restarting the synchronization, there might be no need to go this
far, and we can support it only for apply worker errors.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Jan 26, 2022 at 12:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 26, 2022 at 1:43 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
We probably should just provide an option for the user to specify "subrelid". If null, only the main apply worker will skip the given xid, otherwise only the worker tasked with syncing that particular table will do so. It might take a sequence of ALTER SUBSCRIPTION SET commands to get a broken initial table synchronization to load completely but at least there will not be any surprises as to which tables had transactions skipped and which did not.
That would work but I’m concerned about whether users can specify it
properly. Also, we would need to change the errcontext message
generated by apply_error_callback() so the user can know whether the
error occurred in the apply worker or a tablesync worker.
Or, as another idea, since an error during table synchronization is
not common and could in practice be resolved by truncating the table
and restarting the synchronization, there might be no need to go this
far, and we can support it only for apply worker errors.
Yes, that is what I have also in mind. We can always extend this
feature for the tablesync process because it can fail not only for the
specified skip_xid but also for many other reasons during the initial
copy.
--
With Regards,
Amit Kapila.
On Wed, Jan 26, 2022 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 26, 2022 at 12:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Jan 26, 2022 at 1:43 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
We probably should just provide an option for the user to specify "subrelid". If null, only the main apply worker will skip the given xid, otherwise only the worker tasked with syncing that particular table will do so. It might take a sequence of ALTER SUBSCRIPTION SET commands to get a broken initial table synchronization to load completely but at least there will not be any surprises as to which tables had transactions skipped and which did not.
That would work but I’m concerned about whether users can specify it
properly. Also, we would need to change the errcontext message
generated by apply_error_callback() so the user can know whether the
error occurred in the apply worker or a tablesync worker.
Or, as another idea, since an error during table synchronization is
not common and could in practice be resolved by truncating the table
and restarting the synchronization, there might be no need to go this
far, and we can support it only for apply worker errors.
Yes, that is what I have also in mind. We can always extend this
feature for the tablesync process because it can fail not only for the
specified skip_xid but also for many other reasons during the initial
copy.
I'll update the patch accordingly to test and verify this approach.
In the meantime, I’d like to discuss possible ideas for storing the
error XID somewhere the worker can see it even after a restart. It has
been proposed that the worker update the catalog when an error
occurs, which was criticized because updating the catalog in such a
situation is not a good idea.
The next idea I considered was to store the error XID somewhere in
shmem (e.g., ReplicationState). But it requires at least as many
entries as there are subscriptions in principle, not
max_logical_replication_workers. Since we don’t know that number at
startup time, we would need to use DSM or a cache with a fixed number
of entries. It seems overkill to me.
The third idea, which is slightly better than others, is to update the
catalog by the launcher process, not the worker process; when an error
occurs, the apply worker stores the error XID (and maybe its
subscription OID) into its LogicalRepWorker entry, and the launcher
updates the corresponding entry of pg_subscription catalog before
launching workers. After the worker restarts, it clears the error XID
on the catalog if it successfully applied the transaction with the
error XID. The user can enable the skipping transaction behavior by a
query say ALTER SUBSCRIPTION SKIP ENABLED. The user cannot enable the
skipping behavior if the error XID is not set. If the skipping
behavior is enabled and the error XID is a valid value, the worker
skips the transaction and then clears both the error XID and a flag of
skipping behavior on the catalog.
With this idea, we don’t need a complex mechanism to store the error
XID for each subscription and can ensure to skip only the transaction
in question. But my concern is that the launcher updates the catalog.
Since it doesn’t connect to any database, it probably cannot open the
catalog indexes (because that requires looking up pg_class). Therefore,
we have to use in-place updates here. Through quick tests, I’ve
confirmed that using heap_inplace_update() to update the error XID on
pg_subscription tuples seems to work, but I’m not sure that using an
in-place update here is a legitimate approach.
What do you think and any ideas?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On 26.01.22 05:05, Masahiko Sawada wrote:
I think it is okay to clear after the first successful application of
any transaction. What I was not sure was about the idea of giving
WARNING/ERROR if the first xact to be applied is not the same as
skip_xid.
Do you prefer not to do anything in this case?
I think a warning would be sensible. If the user specifies to skip a
certain transaction and then that doesn't happen, we should at least say
something.
On Thu, Jan 27, 2022 at 10:42 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 26.01.22 05:05, Masahiko Sawada wrote:
I think it is okay to clear after the first successful application of
any transaction. What I was not sure was about the idea of giving
WARNING/ERROR if the first xact to be applied is not the same as
skip_xid.
Do you prefer not to do anything in this case?
I think a warning would be sensible. If the user specifies to skip a
certain transaction and then that doesn't happen, we should at least say
something.
While waiting for comments on the design discussion of both
pg_stat_subscription_workers and the ALTER SUBSCRIPTION SKIP feature,
I’ve incorporated some (minor) comments on the current design patch,
which includes:
* Use LSN instead of XID.
* Raise a warning if the user specifies to skip a certain transaction
and then that doesn’t happen.
* Skip-LSN has an effect on the first non-empty transaction. That is,
it’s cleared after successfully committing a non-empty transaction,
preventing a wrongly specified LSN from remaining set.
* Remove some unnecessary tap tests to reduce the test time.
I think we all agree with the first point regardless of where we store
error information. And speaking of the current design, I think we all
agree on other points. Since the design discussion is ongoing, I’ll
incorporate other comments according to the result of the discussion.
The attached 0001 patch modifies the pg_stat_subscription_workers to
report LSN instead of XID, which is required by ALTER SUBSCRIPTION
SKIP patch, the 0002 patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v11-0001-Report-error-transaction-s-commit-LSN-instead-of.patch
From c1940cb37539030efd016d6a409ea39c72302d82 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 10 Feb 2022 21:18:11 +0900
Subject: [PATCH v11 1/2] Report error transaction's commit LSN instead of XID
to pg_stat_subscription_workers.
---
doc/src/sgml/monitoring.sgml | 6 +--
src/backend/catalog/system_views.sql | 2 +-
src/backend/postmaster/pgstat.c | 10 ++---
src/backend/replication/logical/worker.c | 45 ++++++++++-----------
src/backend/utils/adt/pgstatfuncs.c | 11 ++---
src/include/catalog/pg_proc.dat | 4 +-
src/include/pgstat.h | 6 +--
src/test/regress/expected/rules.out | 4 +-
src/test/subscription/t/026_worker_stats.pl | 14 ++-----
9 files changed, 47 insertions(+), 55 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 62f2a3332b..0820d4a320 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3143,11 +3143,11 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>last_error_xid</structfield> <type>xid</type>
+ <structfield>last_error_lsn</structfield> <type>pg_lsn</type>
</para>
<para>
- Transaction ID of the publisher node being applied when the error
- occurred. This field is null if the error was reported
+ The commit LSN of transaction of the publisher node being applied
+ when the error occurred. This field is null if the error was reported
during the initial data copy.
</para></entry>
</row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3cb69b1f87..9e9578bad4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1271,7 +1271,7 @@ CREATE VIEW pg_stat_subscription_workers AS
w.subrelid,
w.last_error_relid,
w.last_error_command,
- w.last_error_xid,
+ w.last_error_lsn,
w.last_error_count,
w.last_error_message,
w.last_error_time
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 0646f53098..9d95bcb0e3 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -1956,7 +1956,7 @@ pgstat_report_replslot_drop(const char *slotname)
*/
void
pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
- LogicalRepMsgType command, TransactionId xid,
+ LogicalRepMsgType command, XLogRecPtr lsn,
const char *errmsg)
{
PgStat_MsgSubWorkerError msg;
@@ -1968,7 +1968,7 @@ pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
msg.m_subrelid = subrelid;
msg.m_relid = relid;
msg.m_command = command;
- msg.m_xid = xid;
+ msg.m_lsn = lsn;
msg.m_timestamp = GetCurrentTimestamp();
strlcpy(msg.m_message, errmsg, PGSTAT_SUBWORKERERROR_MSGLEN);
@@ -3967,7 +3967,7 @@ pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
{
subwentry->last_error_relid = InvalidOid;
subwentry->last_error_command = 0;
- subwentry->last_error_xid = InvalidTransactionId;
+ subwentry->last_error_lsn = InvalidXLogRecPtr;
subwentry->last_error_count = 0;
subwentry->last_error_time = 0;
subwentry->last_error_message[0] = '\0';
@@ -6173,7 +6173,7 @@ pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
if (subwentry->last_error_relid == msg->m_relid &&
subwentry->last_error_command == msg->m_command &&
- subwentry->last_error_xid == msg->m_xid &&
+ subwentry->last_error_lsn == msg->m_lsn &&
strcmp(subwentry->last_error_message, msg->m_message) == 0)
{
/*
@@ -6188,7 +6188,7 @@ pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
/* Otherwise, update the error information */
subwentry->last_error_relid = msg->m_relid;
subwentry->last_error_command = msg->m_command;
- subwentry->last_error_xid = msg->m_xid;
+ subwentry->last_error_lsn = msg->m_lsn;
subwentry->last_error_count = 1;
subwentry->last_error_time = msg->m_timestamp;
strlcpy(subwentry->last_error_message, msg->m_message,
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index d77bb32bb9..2d2c83cd53 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -225,7 +225,7 @@ typedef struct ApplyErrorCallbackArg
/* Remote node information */
int remote_attnum; /* -1 if invalid */
- TransactionId remote_xid;
+ XLogRecPtr remote_lsn;
TimestampTz ts; /* commit, rollback, or prepare timestamp */
} ApplyErrorCallbackArg;
@@ -234,7 +234,7 @@ static ApplyErrorCallbackArg apply_error_callback_arg =
.command = 0,
.rel = NULL,
.remote_attnum = -1,
- .remote_xid = InvalidTransactionId,
+ .remote_lsn = InvalidXLogRecPtr,
.ts = 0,
};
@@ -334,7 +334,7 @@ static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
-static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
+static inline void set_apply_error_context_xact(XLogRecPtr lsn, TimestampTz ts);
static inline void reset_apply_error_context_info(void);
/*
@@ -787,7 +787,7 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
- set_apply_error_context_xact(begin_data.xid, begin_data.committime);
+ set_apply_error_context_xact(begin_data.final_lsn, begin_data.committime);
remote_final_lsn = begin_data.final_lsn;
@@ -839,7 +839,7 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
- set_apply_error_context_xact(begin_data.xid, begin_data.prepare_time);
+ set_apply_error_context_xact(begin_data.prepare_lsn, begin_data.prepare_time);
remote_final_lsn = begin_data.prepare_lsn;
@@ -938,7 +938,7 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
- set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ set_apply_error_context_xact(prepare_data.commit_lsn, prepare_data.commit_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -979,7 +979,8 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
- set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ set_apply_error_context_xact(rollback_data.rollback_end_lsn,
+ rollback_data.rollback_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1044,7 +1045,8 @@ apply_handle_stream_prepare(StringInfo s)
errmsg_internal("tablesync worker received a STREAM PREPARE message")));
logicalrep_read_stream_prepare(s, &prepare_data);
- set_apply_error_context_xact(prepare_data.xid, prepare_data.prepare_time);
+ set_apply_error_context_xact(prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
@@ -1126,8 +1128,6 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
- set_apply_error_context_xact(stream_xid, 0);
-
/*
* Initialize the worker's stream_fileset if we haven't yet. This will be
* used for the entire duration of the worker so create it in a permanent
@@ -1214,10 +1214,7 @@ apply_handle_stream_abort(StringInfo s)
* just delete the files with serialized info.
*/
if (xid == subxid)
- {
- set_apply_error_context_xact(xid, 0);
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
- }
else
{
/*
@@ -1241,8 +1238,6 @@ apply_handle_stream_abort(StringInfo s)
bool found = false;
char path[MAXPGPATH];
- set_apply_error_context_xact(subxid, 0);
-
subidx = -1;
begin_replication_step();
subxact_info_read(MyLogicalRepWorker->subid, xid);
@@ -1426,7 +1421,7 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
- set_apply_error_context_xact(xid, commit_data.committime);
+ set_apply_error_context_xact(commit_data.commit_lsn, commit_data.committime);
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -3499,7 +3494,7 @@ ApplyWorkerMain(Datum main_arg)
MyLogicalRepWorker->relid,
MyLogicalRepWorker->relid,
0, /* message type */
- InvalidTransactionId,
+ InvalidXLogRecPtr,
errdata->message);
MemoryContextSwitchTo(ecxt);
PG_RE_THROW();
@@ -3640,7 +3635,7 @@ ApplyWorkerMain(Datum main_arg)
? apply_error_callback_arg.rel->localreloid
: InvalidOid,
apply_error_callback_arg.command,
- apply_error_callback_arg.remote_xid,
+ apply_error_callback_arg.remote_lsn,
errdata->message);
MemoryContextSwitchTo(ecxt);
}
@@ -3687,11 +3682,13 @@ apply_error_callback(void *arg)
}
/* append transaction information */
- if (TransactionIdIsNormal(errarg->remote_xid))
+ if (!XLogRecPtrIsInvalid(errarg->remote_lsn))
{
- appendStringInfo(&buf, _(" in transaction %u"), errarg->remote_xid);
+ appendStringInfo(&buf, _(" in transaction which committed at %X/%X"),
+ LSN_FORMAT_ARGS(errarg->remote_lsn));
+
if (errarg->ts != 0)
- appendStringInfo(&buf, _(" at %s"),
+ appendStringInfo(&buf, _(", at %s"),
timestamptz_to_str(errarg->ts));
}
@@ -3701,9 +3698,9 @@ apply_error_callback(void *arg)
/* Set transaction information of apply error callback */
static inline void
-set_apply_error_context_xact(TransactionId xid, TimestampTz ts)
+set_apply_error_context_xact(XLogRecPtr lsn, TimestampTz ts)
{
- apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.remote_lsn = lsn;
apply_error_callback_arg.ts = ts;
}
@@ -3714,5 +3711,5 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.command = 0;
apply_error_callback_arg.rel = NULL;
apply_error_callback_arg.remote_attnum = -1;
- set_apply_error_context_xact(InvalidTransactionId, 0);
+ set_apply_error_context_xact(InvalidXLogRecPtr, 0);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 15cb17ace4..697f72c276 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -30,6 +30,7 @@
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/inet.h"
+#include "utils/pg_lsn.h"
#include "utils/timestamp.h"
#define UINT32_ACCESS_ONCE(var) ((uint32)(*((volatile uint32 *)&(var))))
@@ -2446,8 +2447,8 @@ pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
OIDOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 4, "last_error_command",
TEXTOID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 5, "last_error_xid",
- XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "last_error_lsn",
+ LSNOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_error_count",
INT8OID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 7, "last_error_message",
@@ -2483,9 +2484,9 @@ pg_stat_get_subscription_worker(PG_FUNCTION_ARGS)
else
nulls[i++] = true;
- /* last_error_xid */
- if (TransactionIdIsValid(wentry->last_error_xid))
- values[i++] = TransactionIdGetDatum(wentry->last_error_xid);
+ /* last_error_lsn */
+ if (!XLogRecPtrIsInvalid(wentry->last_error_lsn))
+ values[i++] = LSNGetDatum(wentry->last_error_lsn);
else
nulls[i++] = true;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 7024dbe10a..1b6b745d11 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5375,9 +5375,9 @@
proname => 'pg_stat_get_subscription_worker', prorows => '1', proisstrict => 'f',
proretset => 't', provolatile => 's', proparallel => 'r',
prorettype => 'record', proargtypes => 'oid oid',
- proallargtypes => '{oid,oid,oid,oid,oid,text,xid,int8,text,timestamptz}',
+ proallargtypes => '{oid,oid,oid,oid,oid,text,pg_lsn,int8,text,timestamptz}',
proargmodes => '{i,i,o,o,o,o,o,o,o,o}',
- proargnames => '{subid,subrelid,subid,subrelid,last_error_relid,last_error_command,last_error_xid,last_error_count,last_error_message,last_error_time}',
+ proargnames => '{subid,subrelid,subid,subrelid,last_error_relid,last_error_command,last_error_lsn,last_error_count,last_error_message,last_error_time}',
prosrc => 'pg_stat_get_subscription_worker' },
{ oid => '6118', descr => 'statistics: information about subscription',
proname => 'pg_stat_get_subscription', prorows => '10', proisstrict => 'f',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index e10d20222a..77eb799e81 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -585,7 +585,7 @@ typedef struct PgStat_MsgSubWorkerError
Oid m_relid;
LogicalRepMsgType m_command;
- TransactionId m_xid;
+ XLogRecPtr m_lsn;
TimestampTz m_timestamp;
char m_message[PGSTAT_SUBWORKERERROR_MSGLEN];
} PgStat_MsgSubWorkerError;
@@ -1016,7 +1016,7 @@ typedef struct PgStat_StatSubWorkerEntry
*/
Oid last_error_relid;
LogicalRepMsgType last_error_command;
- TransactionId last_error_xid;
+ XLogRecPtr last_error_lsn;
PgStat_Counter last_error_count;
TimestampTz last_error_time;
char last_error_message[PGSTAT_SUBWORKERERROR_MSGLEN];
@@ -1133,7 +1133,7 @@ extern void pgstat_report_replslot_create(const char *slotname);
extern void pgstat_report_replslot_drop(const char *slotname);
extern void pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
LogicalRepMsgType command,
- TransactionId xid, const char *errmsg);
+ XLogRecPtr lsn, const char *errmsg);
extern void pgstat_report_subscription_drop(Oid subid);
extern void pgstat_initialize(void);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index d652f7b5fb..0b2b2f81e9 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2099,7 +2099,7 @@ pg_stat_subscription_workers| SELECT w.subid,
w.subrelid,
w.last_error_relid,
w.last_error_command,
- w.last_error_xid,
+ w.last_error_lsn,
w.last_error_count,
w.last_error_message,
w.last_error_time
@@ -2110,7 +2110,7 @@ pg_stat_subscription_workers| SELECT w.subid,
SELECT pg_subscription_rel.srsubid AS subid,
pg_subscription_rel.srrelid AS relid
FROM pg_subscription_rel) sr,
- (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, last_error_relid, last_error_command, last_error_xid, last_error_count, last_error_message, last_error_time)
+ (LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) w(subid, subrelid, last_error_relid, last_error_command, last_error_lsn, last_error_count, last_error_message, last_error_time)
JOIN pg_subscription s ON ((w.subid = s.oid)));
pg_stat_sys_indexes| SELECT pg_stat_all_indexes.relid,
pg_stat_all_indexes.indexrelid,
diff --git a/src/test/subscription/t/026_worker_stats.pl b/src/test/subscription/t/026_worker_stats.pl
index 6cf21c8fee..d7f6e702df 100644
--- a/src/test/subscription/t/026_worker_stats.pl
+++ b/src/test/subscription/t/026_worker_stats.pl
@@ -11,7 +11,7 @@ use Test::More tests => 3;
# Test if the error reported on pg_stat_subscription_workers view is expected.
sub test_subscription_error
{
- my ($node, $relname, $command, $xid, $by_apply_worker, $errmsg_prefix, $msg)
+ my ($node, $relname, $command, $by_apply_worker, $errmsg_prefix, $msg)
= @_;
my $check_sql = qq[
@@ -30,11 +30,6 @@ WHERE last_error_relid = '$relname'::regclass
? qq[ AND last_error_command IS NULL]
: qq[ AND last_error_command = '$command'];
- # last_error_xid
- $check_sql .= $xid eq ''
- ? qq[ AND last_error_xid IS NULL]
- : qq[ AND last_error_xid = '$xid'::xid];
-
# Wait for the particular error statistics to be reported.
$node->poll_query_until('postgres', $check_sql,
) or die "Timed out while waiting for " . $msg;
@@ -116,21 +111,20 @@ is($result, q(1), 'check initial data are copied to subscriber');
# Insert more data to test_tab1, raising an error on the subscriber due to
# violation of the unique constraint on test_tab1.
-my $xid = $node_publisher->safe_psql(
+$node_publisher->safe_psql(
'postgres',
qq[
BEGIN;
INSERT INTO test_tab1 VALUES (1);
-SELECT pg_current_xact_id()::xid;
COMMIT;
]);
-test_subscription_error($node_subscriber, 'test_tab1', 'INSERT', $xid,
+test_subscription_error($node_subscriber, 'test_tab1', 'INSERT',
1, # check apply worker error
qq(duplicate key value violates unique constraint),
'error reported by the apply worker');
# Check the table sync worker's error in the view.
-test_subscription_error($node_subscriber, 'test_tab2', '', '',
+test_subscription_error($node_subscriber, 'test_tab2', '',
0, # check tablesync worker error
qq(duplicate key value violates unique constraint),
'the error reported by the table sync worker');
--
2.24.3 (Apple Git-128)
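For reference, once patch 0001 is applied, the failing transaction's commit LSN can be read straight from the statistics view; a sketch (the subscription and returned values are hypothetical):

```sql
-- Hypothetical session: find the commit LSN the apply worker failed on.
SELECT subname, last_error_command, last_error_lsn, last_error_message
FROM pg_stat_subscription_workers
WHERE last_error_lsn IS NOT NULL;
```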
Attachment: v11-0002-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patch
From 0ffefa60a43e8aee46bc8e9aa8556b341f148db9 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v11 2/2] Add ALTER SUBSCRIPTION ... SKIP to skip the
transaction on subscriber nodes.
If an incoming change violates a constraint, logical replication stops
until the problem is resolved. This commit introduces another way to skip the
transaction in question, besides manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify the commit LSN with ALTER SUBSCRIPTION ... SKIP (lsn = XXX),
which updates the pg_subscription.subskiplsn field and tells the apply worker to
skip the transaction. The apply worker skips all data modification changes
within the specified transaction.
After skipping the transaction, the apply worker clears
pg_subscription.subskiplsn.
Author: Masahiko Sawada
Reviewed-by: Vignesh C, Greg Nancarrow, Takamichi Osumi, Haiying Tang, Hou Zhijie, Peter Eisentraut, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 44 ++-
doc/src/sgml/ref/alter_subscription.sgml | 43 +++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/subscriptioncmds.c | 71 +++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 340 ++++++++++++++++-----
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 5 +-
src/include/catalog/pg_subscription.h | 5 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 111 ++++---
src/test/regress/sql/subscription.sql | 16 +
src/test/subscription/t/028_skip_xact.pl | 204 +++++++++++++
16 files changed, 753 insertions(+), 124 deletions(-)
create mode 100644 src/test/subscription/t/028_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 879d2dbce0..d65460b542 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7749,6 +7749,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Commit LSN of the transaction whose changes are to be skipped, if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 96b4886e08..1327c986be 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -352,16 +352,52 @@
</para>
<para>
- The resolution can be done either by changing data or permissions on the subscriber so
- that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ When a conflict produces an error, it is shown in the
+ <structname>pg_stat_subscription_workers</structname> view as follows:
+
+<programlisting>
+postgres=# SELECT * FROM pg_stat_subscription_workers;
+-[ RECORD 1 ]------+-----------------------------------------------------------
+subid | 16391
+subname | test_sub
+subrelid |
+last_error_relid | 16385
+last_error_command | INSERT
+last_error_lsn | 0/14B9240
+last_error_count | 50
+last_error_message | duplicate key value violates unique constraint "test_pkey"
+last_error_time | 2021-09-29 15:52:45.165754+00
+</programlisting>
+
+ and it is also shown in the subscriber's server log as follows:
+
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (id)=(1) already exists.
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction which committed at 0/14B9240, at 2021-09-29 15:52:45.165754+00
+</screen>
+
+ The LSN of the transaction that contains the change violating the constraint can be
+ found from those outputs (0/14B9240 in the above case). The transaction
+ can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command> on the
+ subscription. Alternatively, the transaction can also be skipped by calling the
+ <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
a <parameter>node_name</parameter> corresponding to the subscription name,
and a position. The current position of origins can be seen in the
<link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
+
+ <para>
+ The resolution can be done by changing data or permissions on the subscriber so
+ that they do not conflict with incoming changes, by dropping the conflicting
+ constraint or unique index, by writing a trigger on the subscriber to suppress or
+ redirect conflicting incoming changes, or, as a last resort, by skipping the whole
+ transaction. Skipping the whole transaction also skips changes that might not
+ violate any constraint. This can easily make the subscriber inconsistent,
+ especially if a user specifies the wrong LSN or the wrong position of origin.
+ </para>
</sect1>
<sect1 id="logical-replication-restrictions">
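The pg_replication_origin_advance() alternative mentioned in the documentation above would look roughly like this (a sketch; the origin name follows the built-in pg_<subscription OID> convention, and the OID and LSN shown are hypothetical):

```sql
-- Hypothetical: skip past the bad transaction by advancing the origin.
-- The subscription must be disabled while its origin is advanced.
ALTER SUBSCRIPTION test_sub DISABLE;
SELECT pg_replication_origin_advance('pg_16391', '0/14B9268');
ALTER SUBSCRIPTION test_sub ENABLE;
```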
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..ab99ff8f92 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -207,6 +208,48 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the specified remote transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming changes or by skipping
+ the whole transaction. Using the <command>ALTER SUBSCRIPTION ... SKIP</command>
+ command, the logical replication worker skips all data modification changes
+ within the specified transaction, including changes that might not violate
+ the constraint, so it should only be used as a last resort. This option has
+ no effect on transactions that are already prepared by enabling
+ <literal>two_phase</literal> on the subscriber. After logical replication
+ successfully skips the transaction or commits a non-empty transaction,
+ the LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>lsn</literal> (<type>pg_lsn</type>)</term>
+ <listitem>
+ <para>
+ Specifies the commit LSN of the remote transaction whose changes are to be skipped
+ by the logical replication worker. Skipping
+ individual subtransactions is not supported. Setting <literal>NONE</literal>
+ resets the LSN.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
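Taken together with the documentation above, typical use of the new command would be (a sketch; the subscription name and LSN are illustrative, the LSN coming from the reported error):

```sql
-- Hypothetical: skip the remote transaction that committed at 0/14B9240.
ALTER SUBSCRIPTION test_sub SKIP (lsn = '0/14B9240');

-- If the LSN was mistyped, reset it before the worker acts on it.
ALTER SUBSCRIPTION test_sub SKIP (lsn = NONE);
```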
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..2139ebd0e0 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skiplsn = subform->subskiplsn;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 9e9578bad4..bcef8ea5f1 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,8 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subslotname, subsynccommit, subpublications)
+ substream, subtwophasestate, subskiplsn, subslotname, subsynccommit,
+ subpublications)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_workers AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3ef6607d24..4def4cac4f 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -45,6 +45,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/syscache.h"
/*
@@ -61,6 +62,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_LSN 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +84,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ XLogRecPtr lsn; /* InvalidXLogRecPtr for resetting purpose,
+ * otherwise a valid LSN */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +253,35 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_LSN) &&
+ strcmp(defel->defname, "lsn") == 0)
+ {
+ char *lsn_str = defGetString(defel);
+ XLogRecPtr lsn;
+
+ if (IsSet(opts->specified_opts, SUBOPT_LSN))
+ errorConflictingDefElem(defel, pstate);
+
+ if (strcmp(lsn_str, "none") == 0)
+ {
+ /* Setting lsn = NONE is treated as resetting LSN */
+ lsn = InvalidXLogRecPtr;
+ }
+ else
+ {
+ /* Parse the argument as LSN */
+ lsn = DatumGetLSN(DirectFunctionCall1(pg_lsn_in,
+ CStringGetDatum(lsn_str)));
+
+ if (XLogRecPtrIsInvalid(lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL location (LSN): %s", lsn_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_LSN;
+ opts->lsn = lsn;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +497,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1117,43 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ RepOriginId originid;
+ char originname[NAMEDATALEN];
+ XLogRecPtr remote_lsn;
+
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_LSN, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only LSN option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_LSN));
+
+ if (!XLogRecPtrIsInvalid(opts.lsn))
+ {
+ /* Check that the given LSN is not behind the origin's current progress */
+ snprintf(originname, sizeof(originname), "pg_%u", subid);
+ originid = replorigin_by_name(originname, false);
+ remote_lsn = replorigin_get_progress(originid, false);
+
+ if (!XLogRecPtrIsInvalid(remote_lsn) && opts.lsn < remote_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
+ LSN_FORMAT_ARGS(remote_lsn))));
+ }
+
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(opts.lsn);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ update_tuple = true;
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c4f3242506..cdf821ffa8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9965,6 +9965,15 @@ AlterSubscriptionStmt:
(Node *)makeBoolean(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 2d2c83cd53..fd878cbf1e 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -189,6 +190,7 @@
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/rel.h"
#include "utils/rls.h"
#include "utils/syscache.h"
@@ -257,6 +259,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the remote transaction's commit LSN matches the subskiplsn.
+ * Once we start skipping changes, we don't stop it until we skip all changes of
+ * the transaction even if pg_subscription is updated and MySubscription->skiplsn
+ * gets changed or reset in the meantime. Also, for streamed transactions, we
+ * don't skip receiving and spooling the changes, since we decide whether or not
+ * to skip applying them only when starting to apply changes. The subskiplsn is
+ * cleared after successfully skipping the transaction or applying a non-empty
+ * transaction, where the latter prevents a mistakenly specified subskiplsn from
+ * being left behind. Only apply workers support this skipping behavior for now.
+ */
+static XLogRecPtr skip_xact_commit_lsn = InvalidXLogRecPtr;
+#define is_skipping_changes() (!XLogRecPtrIsInvalid(skip_xact_commit_lsn))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -303,10 +320,13 @@ static void store_flush_position(XLogRecPtr remote_lsn);
static void maybe_reread_subscription(void);
+static void apply_worker_post_transaction(bool empty_tx, XLogRecPtr origin_lsn,
+ TimestampTz origin_ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
-static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
+static bool apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
TupleTableSlot *remoteslot);
@@ -332,6 +352,12 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(XLogRecPtr lsn);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_ts);
+static void clear_subscription_skip_lsn(XLogRecPtr skiplsn, XLogRecPtr origin_lsn,
+ TimestampTz origin_ts);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(XLogRecPtr lsn, TimestampTz ts);
@@ -791,6 +817,8 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -805,6 +833,7 @@ static void
apply_handle_commit(StringInfo s)
{
LogicalRepCommitData commit_data;
+ bool committed;
logicalrep_read_commit(s, &commit_data);
@@ -815,13 +844,10 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
-
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ committed = apply_handle_commit_internal(&commit_data);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ apply_worker_post_transaction(committed, commit_data.end_lsn,
+ commit_data.committime);
}
/*
@@ -843,6 +869,8 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -901,9 +929,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -917,15 +945,14 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
- store_flush_position(prepare_data.end_lsn);
-
- in_remote_transaction = false;
-
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
-
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ /*
+ * Do the post-transaction work and cleanup, including clearing the subskiplsn.
+ * Since we have already prepared the transaction, if the server crashes before
+ * clearing the subskiplsn, the value is left behind but the transaction won't
+ * be resent. That's okay because the leftover value will be cleared when the
+ * next transaction starts.
+ */
+ apply_worker_post_transaction(false, prepare_data.end_lsn, prepare_data.prepare_time);
}
/*
@@ -957,16 +984,8 @@ apply_handle_commit_prepared(StringInfo s)
FinishPreparedTransaction(gid, true);
end_replication_step();
CommitTransactionCommand();
- pgstat_report_stat(false);
-
- store_flush_position(prepare_data.end_lsn);
- in_remote_transaction = false;
-
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ apply_worker_post_transaction(true, prepare_data.end_lsn, prepare_data.commit_time);
}
/*
@@ -977,6 +996,7 @@ apply_handle_rollback_prepared(StringInfo s)
{
LogicalRepRollbackPreparedTxnData rollback_data;
char gid[GIDSIZE];
+ bool finish_prepared = false;
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.rollback_end_lsn,
@@ -1007,18 +1027,12 @@ apply_handle_rollback_prepared(StringInfo s)
FinishPreparedTransaction(gid, false);
end_replication_step();
CommitTransactionCommand();
+ finish_prepared = true;
}
- pgstat_report_stat(false);
-
- store_flush_position(rollback_data.rollback_end_lsn);
- in_remote_transaction = false;
-
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(rollback_data.rollback_end_lsn);
-
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ apply_worker_post_transaction(finish_prepared,
+ rollback_data.rollback_end_lsn,
+ rollback_data.rollback_time);
}
/*
@@ -1058,21 +1072,10 @@ apply_handle_stream_prepare(StringInfo s)
CommitTransactionCommand();
- pgstat_report_stat(false);
-
- store_flush_position(prepare_data.end_lsn);
-
- in_remote_transaction = false;
-
/* unlink the files with serialized changes and subxact info. */
stream_cleanup_files(MyLogicalRepWorker->subid, prepare_data.xid);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
-
- pgstat_report_activity(STATE_IDLE, NULL);
-
- reset_apply_error_context_info();
+ apply_worker_post_transaction(false, prepare_data.end_lsn, prepare_data.prepare_time);
}
/*
@@ -1326,6 +1329,8 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ maybe_start_skipping_changes(lsn);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1414,6 +1419,7 @@ apply_handle_stream_commit(StringInfo s)
{
TransactionId xid;
LogicalRepCommitData commit_data;
+ bool committed;
if (in_streamed_transaction)
ereport(ERROR,
@@ -1427,23 +1433,21 @@ apply_handle_stream_commit(StringInfo s)
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ committed = apply_handle_commit_internal(&commit_data);
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
-
- pgstat_report_activity(STATE_IDLE, NULL);
-
- reset_apply_error_context_info();
+ apply_worker_post_transaction(committed, commit_data.end_lsn,
+ commit_data.committime);
}
/*
* Helper function for apply_handle_commit and apply_handle_stream_commit.
+ * Returns true if the transaction was committed; returns false if the
+ * transaction had no changes and therefore was not committed.
*/
-static void
+static bool
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
if (IsTransactionState())
@@ -1456,18 +1460,14 @@ apply_handle_commit_internal(LogicalRepCommitData *commit_data)
replorigin_session_origin_timestamp = commit_data->committime;
CommitTransactionCommand();
- pgstat_report_stat(false);
-
- store_flush_position(commit_data->end_lsn);
- }
- else
- {
- /* Process any invalidation messages that might have accumulated. */
- AcceptInvalidationMessages();
- maybe_reread_subscription();
+ return true;
}
- in_remote_transaction = false;
+ /* Process any invalidation messages that might have accumulated. */
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+
+ return false;
}
/*
@@ -2361,6 +2361,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recursively when applying spooled changes, save the current
@@ -3656,6 +3667,199 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Post-transaction work for apply workers.
+ *
+ * tx_finished is true if the caller has finished the transaction and updated
+ * the replication origin, so the same transaction won't be resent. origin_lsn
+ * and origin_ts are the remote transaction's end_lsn and commit timestamp.
+ */
+static void
+apply_worker_post_transaction(bool tx_finished, XLogRecPtr origin_lsn,
+ TimestampTz origin_ts)
+{
+ if (unlikely(is_skipping_changes()))
+ {
+ /*
+ * If we are skipping all changes of this transaction, we stop it and
+ * clear the subskiplsn of pg_subscription.
+ */
+ stop_skipping_changes(origin_lsn, origin_ts);
+ }
+ else if (unlikely(tx_finished && !XLogRecPtrIsInvalid(MySubscription->skiplsn)))
+ {
+ /*
+ * The subskiplsn was specified but we successfully finished a non-empty
+ * transaction. In this case, the user may have mistakenly specified the
+ * wrong subskiplsn, so raise a warning and clear it.
+ */
+ ereport(WARNING,
+ errmsg("remote transaction's commit WAL location (LSN) %X/%X did not match skip-LSN %X/%X",
+ LSN_FORMAT_ARGS(origin_lsn),
+ LSN_FORMAT_ARGS(MySubscription->skiplsn)));
+
+ clear_subscription_skip_lsn(MySubscription->skiplsn, origin_lsn, origin_ts);
+ }
+
+ Assert(!IsTransactionState());
+ Assert(!is_skipping_changes());
+
+ pgstat_report_stat(false);
+
+ store_flush_position(origin_lsn);
+
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(origin_lsn);
+
+ pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
+}
+
+/*
+ * Start skipping changes of the transaction if the given commit LSN matches the
+ * LSN specified by subscription's skiplsn.
+ */
+static void
+maybe_start_skipping_changes(XLogRecPtr lsn)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /* Tablesync worker doesn't support skipping changes */
+ if (am_tablesync_worker())
+ return;
+
+ /* Quick exit if the subskiplsn is not specified */
+ if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn)))
+ return;
+
+ if (likely(MySubscription->skiplsn != lsn))
+ {
+ /*
+ * In a rare case, a stale subskiplsn from the past can be left behind because
+ * the server crashed after preparing the transaction but before clearing the
+ * subskiplsn. We clear it without a warning message so as not to confuse the user.
+ */
+ if (unlikely(MySubscription->skiplsn < lsn))
+ clear_subscription_skip_lsn(MySubscription->skiplsn, InvalidXLogRecPtr, 0);
+
+ return;
+ }
+
+ /* Start skipping all changes of this transaction */
+ skip_xact_commit_lsn = lsn;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction which committed at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_commit_lsn)));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xact_commit_lsn. Both origin_lsn and
+ * origin_timestamp are used to update origin state when clearing subskiplsn so
+ * that we can restart streaming from the correct position in case of a crash.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_ts)
+{
+ Assert(is_skipping_changes());
+
+ clear_subscription_skip_lsn(skip_xact_commit_lsn, origin_lsn, origin_ts);
+
+ /* Make sure that clearing the subskiplsn is committed */
+ if (IsTransactionState())
+ CommitTransactionCommand();
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction that committed at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_commit_lsn))));
+
+ /* Stop skipping changes */
+ skip_xact_commit_lsn = InvalidXLogRecPtr;
+}
+
+/* Clear subskiplsn of pg_subscription catalog */
+static void
+clear_subscription_skip_lsn(XLogRecPtr skiplsn, XLogRecPtr origin_lsn, TimestampTz origin_ts)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool started_tx = false;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskiplsn of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskiplsn of the tuple to InvalidXLogRecPtr. If the user
+ * has already changed the subskiplsn before we clear it, we don't update
+ * the catalog and don't advance the replication origin state. So in the
+ * worst case, if the server crashes before sending an acknowledgment of
+ * the flush position, the transaction will be sent again and the user
+ * needs to set the subskiplsn again. We can reduce the possibility by
+ * logging a replication origin WAL record to advance the origin LSN
+ * instead but there is no way to advance the origin timestamp and it
+ * doesn't seem to be worth doing anything about it since it's a very rare
+ * case.
+ */
+ if (subform->subskiplsn == skiplsn)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskiplsn */
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ /*
+ * Update the origin state so we can restart streaming from the correct
+ * position in case of a crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_ts;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3499c0a4d5..7d0c7ba1d4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4312,6 +4312,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't include subskiplsn in the dump as
+ * after restoring the dump this value may no longer be relevant.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 654ef2d7c3..2a92258d9f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6062,7 +6062,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false};
+ false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6107,6 +6107,12 @@ describeSubscriptions(const char *pattern, bool verbose)
", subconninfo AS \"%s\"\n",
gettext_noop("Synchronous commit"),
gettext_noop("Conninfo"));
+
+ /* Skip LSN is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskiplsn AS \"%s\"\n",
+ gettext_noop("Skip LSN"));
}
/* Only display subscriptions in current database. */
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index d1e421bc0f..c1877ec9e5 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1795,7 +1795,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP (",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1811,6 +1811,9 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("lsn");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c291289f..9b77cf916a 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ XLogRecPtr subskiplsn; /* All changes which committed at this LSN
+ * are skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,8 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ XLogRecPtr skiplsn; /* All changes which committed at this LSN are
+ * skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 37fcc4c9b5..f93fab4461 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3719,7 +3719,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..c23427922d 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,26 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+ERROR: invalid WAL location (LSN): 0/0
+-- fail - must be superuser. We need to try this operation as subscription
+-- owner.
+ALTER ROLE regress_subscription_user2 SUPERUSER;
+ALTER ROLE regress_subscription_user NOSUPERUSER;
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+ERROR: must be superuser to skip transaction
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER ROLE regress_subscription_user SUPERUSER;
+ALTER ROLE regress_subscription_user2 NOSUPERUSER;
+SET SESSION AUTHORIZATION 'regress_subscription_user';
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -129,10 +144,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +180,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +203,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -215,10 +230,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -233,10 +248,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +285,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +297,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +309,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..f97b507f18 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,22 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+
+-- fail - must be superuser. We need to try this operation as subscription
+-- owner.
+ALTER ROLE regress_subscription_user2 SUPERUSER;
+ALTER ROLE regress_subscription_user NOSUPERUSER;
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER ROLE regress_subscription_user SUPERUSER;
+ALTER ROLE regress_subscription_user2 NOSUPERUSER;
+SET SESSION AUTHORIZATION 'regress_subscription_user';
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/028_skip_xact.pl b/src/test/subscription/t/028_skip_xact.pl
new file mode 100644
index 0000000000..797e5d1c8a
--- /dev/null
+++ b/src/test/subscription/t/028_skip_xact.pl
@@ -0,0 +1,204 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use Test::More tests => 4;
+use Time::HiRes qw(usleep);
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with existing data on the subscriber. After
+# waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication can continue
+# working by inserting $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname,
+ $nonconflict_data, $expected, $msg)
+ = @_;
+
+ local $Test::Builder::Level = $Test::Builder::Level + 1;
+ my $max_attempts = 180 * 10;
+ my $attempts = 0;
+ my $lsn;
+
+ # Wait for worker error
+ while (1)
+ {
+ $lsn = $node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+SELECT last_error_lsn
+FROM pg_stat_subscription_workers
+WHERE last_error_relid = '$relname'::regclass
+ AND subrelid IS NULL
+ AND last_error_command = 'INSERT'
+ AND starts_with(last_error_message, 'duplicate key value violates unique constraint');
+]);
+
+ # Break if got a valid error LSN.
+ last if ($lsn ne '');
+
+ # Wait 0.1 second before retrying.
+ usleep(100_000);
+
+ $attempts++;
+
+ die "Timed out while waiting for subscription worker error" if ($attempts > $max_attempts);
+ }
+
+ # Set skip lsn
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (lsn = '$lsn')");
+
+ # Restart the subscriber node to restart logical replication with no interval
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = '$subname'"
+ );
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql('postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+
+ # Clear error information
+ $node_subscriber->safe_psql('postgres',
+ qq[
+SELECT pg_stat_reset_subscription_worker(oid)
+FROM pg_subscription
+WHERE subname = '$subname'
+]);
+ $node_subscriber->poll_query_until('postgres',
+ qq[
+SELECT NOT EXISTS (SELECT subid
+FROM pg_stat_subscription_workers
+WHERE subname = '$subname')
+]);
+}
+
+# Create publisher node. Set a low value to logical_decoding_work_mem
+# so we can test streaming cases easily.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf(
+ 'postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber, we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from the publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data into test_tab, raising an error on the subscriber due to a
+# violation of the unique constraint on test_tab. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data into test_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", "test skipping prepare and commit prepared");
+
+# Test for STREAM COMMIT. Insert enough rows to test_tab_streaming to exceed the 64kB
+# limit, also raising an error on the subscriber during applying spooled changes for the
+# same reason. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming",
+ "test_tab_streaming", "(2, md5(2::text))",
+ "2", "test skipping stream-commit");
+
+my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0",
+ "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
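To summarize the user-facing workflow the patch (and its TAP test) exercises, a rough SQL sketch follows; the view and column names (`pg_stat_subscription_workers`, `last_error_lsn`) are those used by this patch version and may differ in what is eventually committed, and the subscription name and LSN are illustrative:

```sql
-- 1. Identify the finish LSN of the failed remote transaction from the
--    subscription worker statistics.
SELECT subname, last_error_lsn, last_error_message
FROM pg_stat_subscription_workers;

-- 2. Tell the apply worker to skip all changes of the transaction that
--    finishes at that LSN. The subskiplsn is cleared automatically once
--    the transaction has been skipped.
ALTER SUBSCRIPTION tap_sub SKIP (lsn = '0/14BFA88');

-- 3. Or reset it before it takes effect, if it was set by mistake.
ALTER SUBSCRIPTION tap_sub SKIP (lsn = NONE);
```

After step 2, restarting the apply worker (as the TAP test does by restarting the subscriber) lets replication resume immediately instead of waiting for the retry interval.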
On Fri, Feb 11, 2022 at 7:40 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Jan 27, 2022 at 10:42 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 26.01.22 05:05, Masahiko Sawada wrote:
I think it is okay to clear after the first successful application of
any transaction. What I was not sure about was the idea of giving a
WARNING/ERROR if the first xact to be applied is not the same as
skip_xid.
Do you prefer not to do anything in this case?
I think a warning would be sensible. If the user specifies to skip a
certain transaction and then that doesn't happen, we should at least say
something.
Meanwhile, while waiting for comments on the discussion about the designs
of both pg_stat_subscription_workers and the ALTER SUBSCRIPTION SKIP
feature, I’ve incorporated some (minor) comments into the current design
patch, which includes:
* Use LSN instead of XID.
I think exposing LSN is a better approach as it doesn't have the
dangers of wraparound. And, I think users can use it with the existing
function pg_replication_origin_advance() which will save us from
adding additional code for this feature. We can explain/expand in docs
how users can use the error information from view/error_logs and use
the existing function to skip conflicting transactions. We might want
to even expose error_origin to make it a bit easier for users but not
sure. I feel the need for the new syntax (and then added code
complexity due to that) isn't warranted if we expose error_LSN and let
users use it with the existing functions.
Do you see any problem with the same?
--
With Regards,
Amit Kapila.
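For comparison, the existing lower-level approach Amit refers to can be sketched as follows. `pg_replication_origin_advance()` is an existing function, and a subscription's origin is named `pg_` followed by the subscription's OID; the subscription name and target LSN below are illustrative, and the LSN must point just past the commit of the transaction to skip:

```sql
-- Disable the subscription so the apply worker is not using the origin.
ALTER SUBSCRIPTION mysub DISABLE;

-- Advance the subscription's replication origin past the failed
-- transaction's commit LSN.
SELECT pg_replication_origin_advance('pg_' || oid, '0/14BFA90'::pg_lsn)
FROM pg_subscription
WHERE subname = 'mysub';

ALTER SUBSCRIPTION mysub ENABLE;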
On 14.02.22 10:16, Amit Kapila wrote:
I think exposing LSN is a better approach as it doesn't have the
dangers of wraparound. And, I think users can use it with the existing
function pg_replication_origin_advance() which will save us from
adding additional code for this feature. We can explain/expand in docs
how users can use the error information from view/error_logs and use
the existing function to skip conflicting transactions. We might want
to even expose error_origin to make it a bit easier for users but not
sure. I feel the need for the new syntax (and then added code
complexity due to that) isn't warranted if we expose error_LSN and let
users use it with the existing functions.
Well, the whole point of this feature is to provide a higher-level
interface instead of pg_replication_origin_advance(). Replication
origins are currently not something the users have to deal with
directly. We already document that you can use
pg_replication_origin_advance() to skip erroring transactions. But that
seems unsatisfactory. It'd be like using pg_surgery to fix unique
constraint violations.
On Tue, Feb 15, 2022 at 7:35 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 14.02.22 10:16, Amit Kapila wrote:
I think exposing LSN is a better approach as it doesn't have the
dangers of wraparound. And, I think users can use it with the existing
function pg_replication_origin_advance() which will save us from
adding additional code for this feature. We can explain/expand in docs
how users can use the error information from view/error_logs and use
the existing function to skip conflicting transactions. We might want
to even expose error_origin to make it a bit easier for users but not
sure. I feel the need for the new syntax (and then added code
complexity due to that) isn't warranted if we expose error_LSN and let
users use it with the existing functions.
Well, the whole point of this feature is to provide a higher-level
interface instead of pg_replication_origin_advance(). Replication
origins are currently not something the users have to deal with
directly. We already document that you can use
pg_replication_origin_advance() to skip erroring transactions. But that
seems unsatisfactory. It'd be like using pg_surgery to fix unique
constraint violations.
+1
I’ve considered a plan for the skipping logical replication
transaction feature toward PG15. Several ideas and patches have been
proposed here and another related thread[1][2] for the skipping
logical replication transaction feature as follows:
A. Change pg_stat_subscription_workers (committed 7a8507329085)
B. Add origin name and commit-LSN to logical replication worker
errcontext (proposed[2]/messages/by-id/CAD21AoBarBf2oTF71ig2g_o=3Z_Dt6_sOpMQma1kFgbnA5OZ_w@mail.gmail.com)
C. Store error information (e.g., the error message and commit-LSN) to
the system catalog
D. Introduce ALTER SUBSCRIPTION SKIP
E. Record the skipped data somewhere: server logs or a table
Given the remaining time for PG15, it’s unlikely to complete all of
them for PG15 by the feature freeze. The most realistic plan for PG15
in my mind is to complete B and D. With these two items, the LSN of
the error-ed transaction is shown in the server log, and we can ask
users to check server logs for the LSN and use it with ALTER
SUBSCRIPTION SKIP command. If the community agrees with B+D, we will
have a user-visible feature for PG15 which can be further
extended/improved in PG16 by adding C and E. I started a new thread[2]
for B yesterday. In this thread, I'd like to discuss D.
I've attached an updated patch for D and here is the summary:
* Introduce a new command ALTER SUBSCRIPTION ... SKIP (lsn =
'0/1234'). The user can get the commit-LSN of the transaction in
question from the server logs thanks to B[2].
* The user-specified LSN (say skip-LSN) is stored in the
pg_subscription catalog.
* The apply worker skips the whole transaction if the transaction's
commit-LSN exactly matches the skip-LSN.
* The skip-LSN has an effect on only the first non-empty transaction
since the worker started to apply changes. IOW it's cleared after
either skipping the whole transaction or successfully committing a
non-empty transaction, preventing the skip-LSN from remaining in the
catalog. Also, since the latter case means that the user set a wrong
skip-LSN, we clear it with a warning.
* ALTER SUBSCRIPTION SKIP doesn't support tablesync workers. But it
would not be a problem in practice since an error during table
synchronization is not common and could be resolved by truncating the
table and restarting the synchronization.
For the above reasons, the ALTER SUBSCRIPTION SKIP command is safer than
the existing way of using pg_replication_origin_advance().
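With the patch, the workflow described above would look roughly like this (the subscription name and LSN are illustrative; the commit LSN comes from the server log / errcontext):

```sql
-- Illustrative values only.
ALTER SUBSCRIPTION test_sub SKIP (lsn = '0/14C0378');

-- The value is stored in the catalog until the transaction is skipped:
SELECT subname, subskiplsn FROM pg_subscription;

-- Reset a mistakenly specified value:
ALTER SUBSCRIPTION test_sub SKIP (lsn = NONE);
```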
I've attached an updated patch along with two patches for cfbot tests
since the main patch (0003) depends on the other two patches. Both
0001 and 0002 patches are the same ones I attached on another
thread[2].
Regards,
[1]: /messages/by-id/20220125063131.4cmvsxbz2tdg6g65@alap3.anarazel.de
[2]: /messages/by-id/CAD21AoBarBf2oTF71ig2g_o=3Z_Dt6_sOpMQma1kFgbnA5OZ_w@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v12-0003-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patchapplication/octet-stream; name=v12-0003-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patchDownload
From 1a5cd86c78d7ed83cefbe74f35ffff3db1f568a1 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v12 3/3] Add ALTER SUBSCRIPTION ... SKIP to skip the
transaction on subscriber nodes.
If an incoming change violates a constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX),
updating pg_subscription.subskiplsn field, telling the apply worker to
skip the transaction. The apply worker skips all data modification changes
within the specified transaction.
After skipping the transaction the apply worker clears
pg_subscription.subskiplsn.
Author: Masahiko Sawada
Reviewed-by: Vignesh C, Greg Nancarrow, Takamichi Osumi, Haiying Tang, Hou Zhijie, Peter Eisentraut, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 26 +-
doc/src/sgml/ref/alter_subscription.sgml | 43 +++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/subscriptioncmds.c | 70 +++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 336 ++++++++++++++++-----
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 5 +-
src/include/catalog/pg_subscription.h | 5 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 101 ++++---
src/test/regress/sql/subscription.sql | 6 +
src/test/subscription/t/029_skip_xact.pl | 182 +++++++++++
16 files changed, 685 insertions(+), 127 deletions(-)
create mode 100644 src/test/subscription/t/029_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 83987a9904..89be2b7682 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7769,6 +7769,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Commit LSN of the transaction whose changes are to be skipped, if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 57272e641e..d34b4485f5 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -361,13 +361,25 @@ CONTEXT: processing remote data during "INSERT" for replication target relation
</screen>
The LSN of the transaction that contains the change violating the constraint and
the replication origin name can be found from those outputs (LSN 0/14C0378 and
- replication origin <literal>pg_16395</literal> in the above case). The transaction
- can be skipped by calling the <link linkend="pg-replication-origin-advance">
- <function>pg_replication_origin_advance()</function></link> function with
- the <parameter>node_name</parameter> and the next LSN of the commit LSN
- (i.e., 0/14C0379) from those outputs. The current position of origins can be
- seen in the <link linkend="view-pg-replication-origin-status">
- <structname>pg_replication_origin_status</structname></link> system view.
+ replication origin <literal>pg_16395</literal> in the above case).
+ </para>
+
+ <para>
+ The resolution can be done by changing data or permissions on the subscriber so
+ that it does not conflict with incoming changes, by dropping the conflicting constraint
+ or unique index, or by writing a trigger on the subscriber to suppress or redirect
+ conflicting incoming changes, or as a last resort, by skipping the whole transaction.
+ </para>
+
+ <para>
+ The whole transaction can be skipped by using <command>ALTER SUBSCRIPTION ... SKIP</command>
+ with the commit LSN on the subscription. Alternatively, the transaction can also be
+ skipped by calling the <link linkend="pg-replication-origin-advance">
+ <function>pg_replication_origin_advance()</function></link> function with the
+ <parameter>node_name</parameter> and the next LSN of the commit LSN
+ (i.e., 0/14C0379) from those outputs. Please note that skipping the whole transaction
+ includes skipping changes that might not violate any constraint. This can easily make
+ the subscriber inconsistent.
</para>
</sect1>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0d6f064f58..f974511c1c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -210,6 +211,48 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the specified remote transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming changes or by skipping
+ the whole transaction. Using the <command>ALTER SUBSCRIPTION ... SKIP</command>
+ command, the logical replication worker skips all data modification changes
+ within the specified transaction, including changes that might not violate
+ the constraint, so, it should only be used as a last resort. This option has
+ no effect on the transactions that are already prepared by enabling
+ <literal>two_phase</literal> on subscriber. After logical replication
+ successfully skips the transaction or commits a non-empty transaction,
+ the LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>lsn</literal> (<type>pg_lsn</type>)</term>
+ <listitem>
+ <para>
+ Specifies the commit LSN of the remote transaction whose changes are to be skipped
+ by the logical replication worker. Skipping
+ individual subtransactions is not supported. Setting <literal>NONE</literal>
+ resets the LSN.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..2139ebd0e0 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->skiplsn = subform->subskiplsn;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 40b7bca5a9..673d0bc7ba 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,8 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subslotname, subsynccommit, subpublications)
+ substream, subtwophasestate, subskiplsn, subslotname, subsynccommit,
+ subpublications)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3ef6607d24..cb80d41494 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -45,6 +45,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/syscache.h"
/*
@@ -61,6 +62,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_LSN 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +84,8 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ XLogRecPtr lsn; /* InvalidXLogRecPtr for resetting purpose,
+ * otherwise a valid LSN */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,6 +253,35 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_LSN) &&
+ strcmp(defel->defname, "lsn") == 0)
+ {
+ char *lsn_str = defGetString(defel);
+ XLogRecPtr lsn;
+
+ if (IsSet(opts->specified_opts, SUBOPT_LSN))
+ errorConflictingDefElem(defel, pstate);
+
+ if (strcmp(lsn_str, "none") == 0)
+ {
+ /* Setting lsn = NONE is treated as resetting LSN */
+ lsn = InvalidXLogRecPtr;
+ }
+ else
+ {
+ /* Parse the argument as LSN */
+ lsn = DatumGetLSN(DirectFunctionCall1(pg_lsn_in,
+ CStringGetDatum(lsn_str)));
+
+ if (XLogRecPtrIsInvalid(lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL location (LSN): %s", lsn_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_LSN;
+ opts->lsn = lsn;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -464,6 +497,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1083,6 +1117,42 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ parse_subscription_options(pstate, stmt->options, SUBOPT_LSN, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only LSN option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_LSN));
+
+ /*
+ * If the user set subskiplsn, we do a sanity check to make
+ * sure that the specified LSN is a probable value.
+ */
+ if (!XLogRecPtrIsInvalid(opts.lsn))
+ {
+ RepOriginId originid;
+ char originname[NAMEDATALEN];
+ XLogRecPtr remote_lsn;
+
+ snprintf(originname, sizeof(originname), "pg_%u", subid);
+ originid = replorigin_by_name(originname, false);
+ remote_lsn = replorigin_get_progress(originid, false);
+
+ /* Check the given LSN is at least a future LSN */
+ if (!XLogRecPtrIsInvalid(remote_lsn) && opts.lsn < remote_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
+ LSN_FORMAT_ARGS(remote_lsn))));
+ }
+
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(opts.lsn);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ update_tuple = true;
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a03b33b53b..0036c2f9e2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9983,6 +9983,15 @@ AlterSubscriptionStmt:
(Node *)makeBoolean(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index a159561e31..91a7eaffe1 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -189,6 +190,7 @@
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/rel.h"
#include "utils/rls.h"
#include "utils/syscache.h"
@@ -261,6 +263,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the remote transaction's commit LSN matches the subskiplsn.
+ * Once we start skipping changes, we don't stop it until we skip all changes of
+ * the transaction even if pg_subscription is updated and MySubscription->skiplsn
+ * gets changed or reset during that. Also, in streaming transaction cases, we
+ * don't skip receiving and spooling the changes, since we decide whether or not
+ * to skip applying the changes when starting to apply changes. The subskiplsn is
+ * cleared after successfully skipping the transaction or applying a non-empty
+ * transaction, where the latter prevents a mistakenly specified subskiplsn from
+ * being left behind.
+ */
+static XLogRecPtr skip_xact_commit_lsn = InvalidXLogRecPtr;
+#define is_skipping_changes() (!XLogRecPtrIsInvalid(skip_xact_commit_lsn))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -307,10 +324,13 @@ static void store_flush_position(XLogRecPtr remote_lsn);
static void maybe_reread_subscription(void);
+static void apply_worker_post_transaction(bool empty_tx, XLogRecPtr origin_lsn,
+ TimestampTz origin_ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
-static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
+static bool apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
TupleTableSlot *remoteslot);
@@ -336,6 +356,12 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(XLogRecPtr lsn);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_ts);
+static void clear_subscription_skip_lsn(XLogRecPtr skiplsn, XLogRecPtr origin_lsn,
+ TimestampTz origin_ts);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, XLogRecPtr lsn,
@@ -797,6 +823,8 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -811,6 +839,7 @@ static void
apply_handle_commit(StringInfo s)
{
LogicalRepCommitData commit_data;
+ bool committed;
logicalrep_read_commit(s, &commit_data);
@@ -821,13 +850,10 @@ apply_handle_commit(StringInfo s)
LSN_FORMAT_ARGS(commit_data.commit_lsn),
LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
-
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ committed = apply_handle_commit_internal(&commit_data);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ apply_worker_post_transaction(committed, commit_data.end_lsn,
+ commit_data.committime);
}
/*
@@ -850,6 +876,8 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -908,9 +936,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -924,15 +952,14 @@ apply_handle_prepare(StringInfo s)
CommitTransactionCommand();
pgstat_report_stat(false);
- store_flush_position(prepare_data.end_lsn);
-
- in_remote_transaction = false;
-
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
-
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ /*
+ * Do the post transaction work and cleanup. Since we already have
+ * prepared the transaction, in a case where the server crashes before
+ * clearing the subskiplsn, it will be left but the transaction won't be
+ * resent. But that's okay because it will be cleared when starting to
+ * apply the next transaction.
+ */
+ apply_worker_post_transaction(false, prepare_data.end_lsn, prepare_data.prepare_time);
}
/*
@@ -965,16 +992,8 @@ apply_handle_commit_prepared(StringInfo s)
FinishPreparedTransaction(gid, true);
end_replication_step();
CommitTransactionCommand();
- pgstat_report_stat(false);
-
- store_flush_position(prepare_data.end_lsn);
- in_remote_transaction = false;
-
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ apply_worker_post_transaction(true, prepare_data.end_lsn, prepare_data.commit_time);
}
/*
@@ -985,6 +1004,7 @@ apply_handle_rollback_prepared(StringInfo s)
{
LogicalRepRollbackPreparedTxnData rollback_data;
char gid[GIDSIZE];
+ bool finish_prepared = false;
logicalrep_read_rollback_prepared(s, &rollback_data);
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_end_lsn,
@@ -1015,18 +1035,12 @@ apply_handle_rollback_prepared(StringInfo s)
FinishPreparedTransaction(gid, false);
end_replication_step();
CommitTransactionCommand();
+ finish_prepared = true;
}
- pgstat_report_stat(false);
-
- store_flush_position(rollback_data.rollback_end_lsn);
- in_remote_transaction = false;
-
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(rollback_data.rollback_end_lsn);
-
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ apply_worker_post_transaction(finish_prepared,
+ rollback_data.rollback_end_lsn,
+ rollback_data.rollback_time);
}
/*
@@ -1066,21 +1080,10 @@ apply_handle_stream_prepare(StringInfo s)
CommitTransactionCommand();
- pgstat_report_stat(false);
-
- store_flush_position(prepare_data.end_lsn);
-
- in_remote_transaction = false;
-
/* unlink the files with serialized changes and subxact info. */
stream_cleanup_files(MyLogicalRepWorker->subid, prepare_data.xid);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
-
- pgstat_report_activity(STATE_IDLE, NULL);
-
- reset_apply_error_context_info();
+ apply_worker_post_transaction(false, prepare_data.end_lsn, prepare_data.prepare_time);
}
/*
@@ -1341,6 +1344,8 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ maybe_start_skipping_changes(lsn);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1429,6 +1434,7 @@ apply_handle_stream_commit(StringInfo s)
{
TransactionId xid;
LogicalRepCommitData commit_data;
+ bool committed;
if (in_streamed_transaction)
ereport(ERROR,
@@ -1442,23 +1448,20 @@ apply_handle_stream_commit(StringInfo s)
apply_spooled_messages(xid, commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+ committed = apply_handle_commit_internal(&commit_data);
/* unlink the files with serialized changes and subxact info */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
-
- pgstat_report_activity(STATE_IDLE, NULL);
-
- reset_apply_error_context_info();
+ apply_worker_post_transaction(committed, commit_data.end_lsn,
+ commit_data.committime);
}
/*
* Helper function for apply_handle_commit and apply_handle_stream_commit.
+ * Return true if the transaction was committed, otherwise return false.
*/
-static void
+static bool
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
if (IsTransactionState())
@@ -1471,18 +1474,14 @@ apply_handle_commit_internal(LogicalRepCommitData *commit_data)
replorigin_session_origin_timestamp = commit_data->committime;
CommitTransactionCommand();
- pgstat_report_stat(false);
-
- store_flush_position(commit_data->end_lsn);
- }
- else
- {
- /* Process any invalidation messages that might have accumulated. */
- AcceptInvalidationMessages();
- maybe_reread_subscription();
+ return true;
}
- in_remote_transaction = false;
+ /* Process any invalidation messages that might have accumulated. */
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+
+ return false;
}
/*
@@ -2376,6 +2375,17 @@ apply_dispatch(StringInfo s)
LogicalRepMsgType action = pq_getmsgbyte(s);
LogicalRepMsgType saved_command;
+ /*
+ * Skip all data-modification changes if we're skipping changes of this
+ * transaction.
+ */
+ if (is_skipping_changes() &&
+ (action == LOGICAL_REP_MSG_INSERT ||
+ action == LOGICAL_REP_MSG_UPDATE ||
+ action == LOGICAL_REP_MSG_DELETE ||
+ action == LOGICAL_REP_MSG_TRUNCATE))
+ return;
+
/*
* Set the current command being applied. Since this function can be
* called recursively when applying spooled changes, save the current
@@ -3672,6 +3682,196 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Post-transaction work for apply workers.
+ *
+ * tx_finished is true if the caller has finished the transaction, updating
+ * the replication origin so the same transaction won't be resent in case of
+ * a crash. Both origin_lsn and origin_timestamp are the remote transaction's
+ * end_lsn and commit timestamp, respectively.
+ */
+static void
+apply_worker_post_transaction(bool tx_finished, XLogRecPtr origin_lsn,
+ TimestampTz origin_ts)
+{
+ if (unlikely(is_skipping_changes()))
+ {
+ /*
+ * If we are skipping all changes of this transaction, we stop it and
+ * clear subskiplsn of pg_subscription.
+ */
+ stop_skipping_changes(origin_lsn, origin_ts);
+ }
+ else if (unlikely(tx_finished && !XLogRecPtrIsInvalid(MySubscription->skiplsn)))
+ {
+ /*
+ * The subskiplsn was specified but we successfully finished a non-empty
+ * transaction. In this case, it's possible that the user mistakenly
+ * specified the wrong subskiplsn, so raise a warning and clear it.
+ */
+ ereport(WARNING,
+ errmsg("remote transaction's commit WAL location (LSN) %X/%X did not match skip-LSN %X/%X",
+ LSN_FORMAT_ARGS(origin_lsn),
+ LSN_FORMAT_ARGS(MySubscription->skiplsn)));
+
+ clear_subscription_skip_lsn(MySubscription->skiplsn, origin_lsn, origin_ts);
+ }
+
+ Assert(!IsTransactionState());
+ Assert(!is_skipping_changes());
+
+ pgstat_report_stat(false);
+
+ store_flush_position(origin_lsn);
+
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(origin_lsn);
+
+ pgstat_report_activity(STATE_IDLE, NULL);
+
+ reset_apply_error_context_info();
+}
+
+/*
+ * Start skipping changes of the transaction if the given commit LSN matches the
+ * LSN specified by subscription's skiplsn.
+ */
+static void
+maybe_start_skipping_changes(XLogRecPtr lsn)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn)))
+ return;
+
+ if (likely(MySubscription->skiplsn != lsn))
+ {
+ /*
+ * It's a rare case; a past subskiplsn was left because the server
+ * crashed after preparing the transaction and before clearing the
+ * subskiplsn. We clear it without a warning message so as not to confuse
+ * the user.
+ */
+ if (unlikely(MySubscription->skiplsn < lsn))
+ clear_subscription_skip_lsn(MySubscription->skiplsn, InvalidXLogRecPtr, 0);
+
+ return;
+ }
+
+ /* Start skipping all changes of this transaction */
+ skip_xact_commit_lsn = lsn;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction which committed at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_commit_lsn)));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xact_commit_lsn. Both origin_lsn and
+ * origin_timestamp are used to update origin state when clearing subskiplsn so
+ * that we can restart streaming from correct position in case of crash.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_ts)
+{
+ Assert(is_skipping_changes());
+
+ clear_subscription_skip_lsn(skip_xact_commit_lsn, origin_lsn, origin_ts);
+
+ /* Make sure that clearing subskiplsn is committed */
+ if (IsTransactionState())
+ CommitTransactionCommand();
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction which committed at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_commit_lsn))));
+
+ /* Stop skipping changes */
+ skip_xact_commit_lsn = InvalidXLogRecPtr;
+}
+
+/* Clear subskiplsn of pg_subscription catalog */
+static void
+clear_subscription_skip_lsn(XLogRecPtr skiplsn, XLogRecPtr origin_lsn, TimestampTz origin_ts)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool started_tx = false;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskiplsn of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskiplsn of the tuple to InvalidXLogRecPtr. If the user
+ * has already changed subskiplsn before clearing it, we don't update the
+ * catalog and don't advance the replication origin state. So in the
+ * worst case, if the server crashes before sending an acknowledgment of
+ * the flush position the transaction will be sent again and the user
+ * needs to set subskiplsn again. We can reduce the possibility by
+ * logging a replication origin WAL record to advance the origin LSN
+ * instead but there is no way to advance the origin timestamp and it
+ * doesn't seem to be worth doing anything about it since it's a very rare
+ * case.
+ */
+ if (subform->subskiplsn == skiplsn)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskiplsn */
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ /*
+ * Update the origin state so we can restart streaming from the
+ * correct position in case of a crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_ts;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e69dcf8a48..dc3b28660d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4355,6 +4355,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't include subskiplsn in the dump, as
+ * the value may no longer be relevant after restoring the dump.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index e3382933d9..1750b71a4a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6084,7 +6084,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false};
+ false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6129,6 +6129,12 @@ describeSubscriptions(const char *pattern, bool verbose)
", subconninfo AS \"%s\"\n",
gettext_noop("Synchronous commit"),
gettext_noop("Conninfo"));
+
+ /* Skip LSN is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskiplsn AS \"%s\"\n",
+ gettext_noop("Skip LSN"));
}
/* Only display subscriptions in current database. */
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6957567264..604047f341 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1819,7 +1819,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP (",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1835,6 +1835,9 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("lsn");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c291289f..89a5861d19 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ XLogRecPtr subskiplsn; /* All changes which committed at this LSN are
+ * skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +106,8 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ XLogRecPtr skiplsn; /* All changes which committed at this LSN are
+ * skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 1617702d9d..6f83a79a96 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3726,7 +3726,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..4710d53698 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,16 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -129,10 +134,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +170,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +193,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -215,10 +220,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -233,10 +238,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +275,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +287,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +299,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index bd0f4af1e4..753be1f323 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,12 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/029_skip_xact.pl b/src/test/subscription/t/029_skip_xact.pl
new file mode 100644
index 0000000000..833088cf86
--- /dev/null
+++ b/src/test/subscription/t/029_skip_xact.pl
@@ -0,0 +1,182 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 4;
+use Time::HiRes qw(usleep);
+
+my $offset = 0;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with existing data on the subscriber. The
+# commit-LSN of the failed transaction, which is passed to
+# ALTER SUBSCRIPTION ... SKIP, is fetched from the server logs. After executing
+# ALTER SUBSCRIPTION ... SKIP, we
+# check if logical replication can continue working by inserting $nonconflict_data
+# on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname,
+ $nonconflict_data, $expected, $msg)
+ = @_;
+
+ # Wait until a conflict occurs on the subscriber.
+ $node_subscriber->wait_for_log(
+ qr/CONTEXT: processing remote data during "INSERT" for replication target relation/,
+ $offset);
+
+ # Get the commit-LSN of the error transaction.
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/processing remote data during "INSERT" for replication target relation "public.$relname" in transaction \d+ committed at LSN ([[:xdigit:]]+\/[[:xdigit:]]+)/
+ or die "could not get error-LSN";
+ my $lsn = $1;
+
+ # Set skip lsn
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (lsn = '$lsn')");
+
+ # Restart the subscriber node so that logical replication restarts
+ # immediately, without waiting for wal_retrieve_retry_interval
+ $node_subscriber->restart;
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = '$subname'"
+ );
+
+ # Wait for the log message indicating that the transaction was skipped, and
+ # advance the offset of the log file for the next test.
+ $offset = $node_subscriber->wait_for_log(
+ qr/LOG: done skipping logical replication transaction which committed at $lsn/,
+ $offset);
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql('postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node. Set a low value to logical_decoding_work_mem
+# so we can test streaming cases easily.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf(
+ 'postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber, we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from the publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (streaming = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+
+# Insert data into test_tab, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data into test_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", "test skipping prepare and commit prepared");
+
+# Test for STREAM COMMIT. Insert enough rows into test_tab_streaming to exceed
+# the 64kB limit, which also raises an error on the subscriber while applying the
+# spooled changes, for the same reason. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming",
+ "test_tab_streaming", "(2, md5(2::text))",
+ "2", "test skipping stream-commit");
+
+my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0",
+ "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
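Putting the pieces together, the user-facing workflow that the TAP test above automates looks roughly like the following on the subscriber. The subscription name and LSN are illustrative; the LSN comes from the apply worker's error context in the server log:

```sql
-- The apply worker logs the failing transaction's commit LSN, e.g.:
--   CONTEXT: processing remote data during "INSERT" for replication target
--   relation "public.test" in transaction 725 committed at LSN 0/14C0378 ...
-- Tell the apply worker to skip exactly that transaction:
ALTER SUBSCRIPTION mysub SKIP (lsn = '0/14C0378');

-- After the transaction has been skipped, subskiplsn is reset to 0/0:
SELECT subskiplsn FROM pg_subscription WHERE subname = 'mysub';

-- Setting a skip LSN can be undone before it takes effect:
ALTER SUBSCRIPTION mysub SKIP (lsn = NONE);
```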
Attachment: v12-0001-Use-complete-sentences-in-logical-replication-wo.patch
From 0eec9d7fb9bed71af6fc26fce350c9df426cae1b Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 28 Feb 2022 17:53:28 +0900
Subject: [PATCH v12 1/3] Use complete sentences in logical replication worker
errcontext.
Previously, the message for the logical replication worker errcontext was
incrementally built, which was not translation friendly. Instead, we
use complete sentences with if-else branches.
---
src/backend/replication/logical/worker.c | 49 ++++++++++++------------
1 file changed, 24 insertions(+), 25 deletions(-)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 7e267f7960..e81f85e2a3 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3648,38 +3648,37 @@ IsLogicalWorker(void)
static void
apply_error_callback(void *arg)
{
- StringInfoData buf;
ApplyErrorCallbackArg *errarg = &apply_error_callback_arg;
if (apply_error_callback_arg.command == 0)
return;
- initStringInfo(&buf);
- appendStringInfo(&buf, _("processing remote data during \"%s\""),
- logicalrep_message_type(errarg->command));
-
- /* append relation information */
- if (errarg->rel)
- {
- appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
- errarg->rel->remoterel.nspname,
- errarg->rel->remoterel.relname);
- if (errarg->remote_attnum >= 0)
- appendStringInfo(&buf, _(" column \"%s\""),
- errarg->rel->remoterel.attnames[errarg->remote_attnum]);
- }
-
- /* append transaction information */
- if (TransactionIdIsNormal(errarg->remote_xid))
+ if (errarg->rel == NULL)
{
- appendStringInfo(&buf, _(" in transaction %u"), errarg->remote_xid);
- if (errarg->ts != 0)
- appendStringInfo(&buf, _(" at %s"),
- timestamptz_to_str(errarg->ts));
+ if (!TransactionIdIsValid(errarg->remote_xid))
+ errcontext("processing remote data during \"%s\"",
+ logicalrep_message_type(errarg->command));
+ else
+ errcontext("processing remote data during \"%s\" in transaction %u at %s",
+ logicalrep_message_type(errarg->command),
+ errarg->remote_xid,
+ (errarg->ts != 0) ? timestamptz_to_str(errarg->ts) : "(not-set)");
}
-
- errcontext("%s", buf.data);
- pfree(buf.data);
+ else if (errarg->remote_attnum < 0)
+ errcontext("processing remote data during \"%s\" for replication target relation \"%s.%s\" in transaction %u at %s",
+ logicalrep_message_type(errarg->command),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname,
+ errarg->remote_xid,
+ (errarg->ts != 0) ? timestamptz_to_str(errarg->ts) : "(not-set)");
+ else
+ errcontext("processing remote data during \"%s\" for replication target relation \"%s.%s\" column \"%s\" in transaction %u at %s",
+ logicalrep_message_type(errarg->command),
+ errarg->rel->remoterel.nspname,
+ errarg->rel->remoterel.relname,
+ errarg->rel->remoterel.attnames[errarg->remote_attnum],
+ errarg->remote_xid,
+ (errarg->ts != 0) ? timestamptz_to_str(errarg->ts) : "(not-set)");
}
/* Set transaction information of apply error callback */
--
2.24.3 (Apple Git-128)
Attachment: v12-0002-Add-the-origin-name-and-remote-commit-LSN-to-log.patch
From 2b220abdf8ee18b8c60ef0dce584d9ec9e36aedd Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Thu, 24 Feb 2022 16:56:58 +0900
Subject: [PATCH v12 2/3] Add the origin name and remote commit-LSN to logical
replication worker errcontext.
This commit adds both the commit-LSN and replication origin name to
the existing error context message.
This will help users specify the origin name and commit-LSN for the
pg_replication_origin_advance() SQL function to skip the particular transaction.
---
doc/src/sgml/logical-replication.sgml | 19 +++++--
src/backend/replication/logical/worker.c | 71 ++++++++++++++++++------
2 files changed, 67 insertions(+), 23 deletions(-)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index fb4472356d..57272e641e 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -352,12 +352,21 @@
<para>
The resolution can be done either by changing data or permissions on the subscriber so
that it does not conflict with the incoming change or by skipping the
- transaction that conflicts with the existing data. The transaction can be
- skipped by calling the <link linkend="pg-replication-origin-advance">
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is shown in the subscriber's server logs as follows:
+<screen>
+ERROR: duplicate key value violates unique constraint "test_pkey"
+DETAIL: Key (c)=(1) already exists.
CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 725 committed at LSN 0/14C0378 and timestamp 2022-02-28 20:58:27.964238+00 from replication origin "pg_16395"
+</screen>
+ The LSN of the transaction that contains the change violating the constraint and
+ the replication origin name can be found from those outputs (LSN 0/14C0378 and
+ replication origin <literal>pg_16395</literal> in the above case). The transaction
+ can be skipped by calling the <link linkend="pg-replication-origin-advance">
<function>pg_replication_origin_advance()</function></link> function with
- a <parameter>node_name</parameter> corresponding to the subscription name,
- and a position. The current position of origins can be seen in the
- <link linkend="view-pg-replication-origin-status">
+ the <parameter>node_name</parameter> and the next LSN of the commit LSN
+ (i.e., 0/14C0379) from those outputs. The current position of origins can be
+ seen in the <link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
</para>
</sect1>
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index e81f85e2a3..a159561e31 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -226,6 +226,8 @@ typedef struct ApplyErrorCallbackArg
/* Remote node information */
int remote_attnum; /* -1 if invalid */
TransactionId remote_xid;
+ XLogRecPtr commit_lsn;
+ char *origin_name;
TimestampTz ts; /* commit, rollback, or prepare timestamp */
} ApplyErrorCallbackArg;
@@ -235,6 +237,8 @@ static ApplyErrorCallbackArg apply_error_callback_arg =
.rel = NULL,
.remote_attnum = -1,
.remote_xid = InvalidTransactionId,
+ .commit_lsn = InvalidXLogRecPtr,
+ .origin_name = NULL,
.ts = 0,
};
@@ -334,7 +338,8 @@ static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
-static inline void set_apply_error_context_xact(TransactionId xid, TimestampTz ts);
+static inline void set_apply_error_context_xact(TransactionId xid, XLogRecPtr lsn,
+ TimestampTz ts);
static inline void reset_apply_error_context_info(void);
/*
@@ -787,7 +792,8 @@ apply_handle_begin(StringInfo s)
LogicalRepBeginData begin_data;
logicalrep_read_begin(s, &begin_data);
- set_apply_error_context_xact(begin_data.xid, begin_data.committime);
+ set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn,
+ begin_data.committime);
remote_final_lsn = begin_data.final_lsn;
@@ -839,7 +845,8 @@ apply_handle_begin_prepare(StringInfo s)
errmsg_internal("tablesync worker received a BEGIN PREPARE message")));
logicalrep_read_begin_prepare(s, &begin_data);
- set_apply_error_context_xact(begin_data.xid, begin_data.prepare_time);
+ set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn,
+ begin_data.prepare_time);
remote_final_lsn = begin_data.prepare_lsn;
@@ -938,7 +945,8 @@ apply_handle_commit_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_commit_prepared(s, &prepare_data);
- set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_time);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn,
+ prepare_data.commit_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
@@ -979,7 +987,8 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
- set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_time);
+ set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_end_lsn,
+ rollback_data.rollback_time);
/* Compute GID for two_phase transactions. */
TwoPhaseTransactionGid(MySubscription->oid, rollback_data.xid,
@@ -1044,7 +1053,8 @@ apply_handle_stream_prepare(StringInfo s)
errmsg_internal("tablesync worker received a STREAM PREPARE message")));
logicalrep_read_stream_prepare(s, &prepare_data);
- set_apply_error_context_xact(prepare_data.xid, prepare_data.prepare_time);
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
@@ -1126,7 +1136,7 @@ apply_handle_stream_start(StringInfo s)
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg_internal("invalid transaction ID in streamed replication transaction")));
- set_apply_error_context_xact(stream_xid, 0);
+ set_apply_error_context_xact(stream_xid, InvalidXLogRecPtr, 0);
/*
* Initialize the worker's stream_fileset if we haven't yet. This will be
@@ -1215,7 +1225,7 @@ apply_handle_stream_abort(StringInfo s)
*/
if (xid == subxid)
{
- set_apply_error_context_xact(xid, 0);
+ set_apply_error_context_xact(xid, InvalidXLogRecPtr, 0);
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
}
else
@@ -1241,7 +1251,7 @@ apply_handle_stream_abort(StringInfo s)
bool found = false;
char path[MAXPGPATH];
- set_apply_error_context_xact(subxid, 0);
+ set_apply_error_context_xact(subxid, InvalidXLogRecPtr, 0);
subidx = -1;
begin_replication_step();
@@ -1426,7 +1436,7 @@ apply_handle_stream_commit(StringInfo s)
errmsg_internal("STREAM COMMIT message without STREAM STOP")));
xid = logicalrep_read_stream_commit(s, &commit_data);
- set_apply_error_context_xact(xid, commit_data.committime);
+ set_apply_error_context_xact(xid, commit_data.commit_lsn, commit_data.committime);
elog(DEBUG1, "received commit for streamed transaction %u", xid);
@@ -3501,6 +3511,17 @@ ApplyWorkerMain(Datum main_arg)
myslotname = MemoryContextStrdup(ApplyContext, syncslotname);
pfree(syncslotname);
+
+ /*
+ * Allocate the origin name in long-lived context for error context
+ * message
+ */
+ ReplicationOriginNameForTablesync(MySubscription->oid,
+ MyLogicalRepWorker->relid,
+ originname,
+ sizeof(originname));
+ apply_error_callback_arg.origin_name = MemoryContextStrdup(ApplyContext,
+ originname);
}
else
{
@@ -3544,6 +3565,13 @@ ApplyWorkerMain(Datum main_arg)
* does some initializations on the upstream so let's still call it.
*/
(void) walrcv_identify_system(LogRepWorkerWalRcvConn, &startpointTLI);
+
+ /*
+ * Allocate the origin name in long-lived context for error context
+ * message
+ */
+ apply_error_callback_arg.origin_name = MemoryContextStrdup(ApplyContext,
+ originname);
}
/*
@@ -3659,33 +3687,40 @@ apply_error_callback(void *arg)
errcontext("processing remote data during \"%s\"",
logicalrep_message_type(errarg->command));
else
- errcontext("processing remote data during \"%s\" in transaction %u at %s",
+ errcontext("processing remote data during \"%s\" in transaction %u committed at LSN %X/%X and timestamp %s from replication origin \"%s\"",
logicalrep_message_type(errarg->command),
errarg->remote_xid,
- (errarg->ts != 0) ? timestamptz_to_str(errarg->ts) : "(not-set)");
+ LSN_FORMAT_ARGS(errarg->commit_lsn),
+ (errarg->ts != 0) ? timestamptz_to_str(errarg->ts) : "(not-set)",
+ errarg->origin_name);
}
else if (errarg->remote_attnum < 0)
- errcontext("processing remote data during \"%s\" for replication target relation \"%s.%s\" in transaction %u at %s",
+ errcontext("processing remote data during \"%s\" for replication target relation \"%s.%s\" in transaction %u committed at LSN %X/%X and timestamp %s from replication origin \"%s\"",
logicalrep_message_type(errarg->command),
errarg->rel->remoterel.nspname,
errarg->rel->remoterel.relname,
errarg->remote_xid,
- (errarg->ts != 0) ? timestamptz_to_str(errarg->ts) : "(not-set)");
+ LSN_FORMAT_ARGS(errarg->commit_lsn),
+ (errarg->ts != 0) ? timestamptz_to_str(errarg->ts) : "(not-set)",
+ errarg->origin_name);
else
- errcontext("processing remote data during \"%s\" for replication target relation \"%s.%s\" column \"%s\" in transaction %u at %s",
+ errcontext("processing remote data during \"%s\" for replication target relation \"%s.%s\" column \"%s\" in transaction %u committed at LSN %X/%X and timestamp %s from replication origin \"%s\"",
logicalrep_message_type(errarg->command),
errarg->rel->remoterel.nspname,
errarg->rel->remoterel.relname,
errarg->rel->remoterel.attnames[errarg->remote_attnum],
errarg->remote_xid,
- (errarg->ts != 0) ? timestamptz_to_str(errarg->ts) : "(not-set)");
+ LSN_FORMAT_ARGS(errarg->commit_lsn),
+ (errarg->ts != 0) ? timestamptz_to_str(errarg->ts) : "(not-set)",
+ errarg->origin_name);
}
/* Set transaction information of apply error callback */
static inline void
-set_apply_error_context_xact(TransactionId xid, TimestampTz ts)
+set_apply_error_context_xact(TransactionId xid, XLogRecPtr lsn, TimestampTz ts)
{
apply_error_callback_arg.remote_xid = xid;
+ apply_error_callback_arg.commit_lsn = lsn;
apply_error_callback_arg.ts = ts;
}
@@ -3696,5 +3731,5 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.command = 0;
apply_error_callback_arg.rel = NULL;
apply_error_callback_arg.remote_attnum = -1;
- set_apply_error_context_xact(InvalidTransactionId, 0);
+ set_apply_error_context_xact(InvalidTransactionId, InvalidXLogRecPtr, 0);
}
--
2.24.3 (Apple Git-128)
On Tue, Mar 1, 2022 at 8:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I’ve considered a plan for the skipping logical replication
transaction feature toward PG15. Several ideas and patches have been
proposed here and in another related thread[1][2], as follows:
A. Change pg_stat_subscription_workers (committed 7a8507329085)
B. Add origin name and commit-LSN to logical replication worker
errcontext (proposed[2])
C. Store error information (e.g., the error message and commit-LSN) to
the system catalog
D. Introduce ALTER SUBSCRIPTION SKIP
E. Record the skipped data somewhere: server logs or a table
Given the remaining time for PG15, it's unlikely that all of them can
be completed by the feature freeze. The most realistic plan for PG15
in my mind is to complete B and D. With these two items, the LSN of
the failed transaction is shown in the server log, and we can ask
users to check the server logs for the LSN and use it with the ALTER
SUBSCRIPTION SKIP command.
It makes sense to me to try to finish B and D from the above list for
PG-15. I can review the patch for D in detail if others don't have an
objection to it.
Peter E., others, any opinion on this matter?
If the community agrees with B+D, we will
have a user-visible feature for PG15 which can be further
extended/improved in PG16 by adding C and E.
Agreed.
I've attached an updated patch for D and here is the summary:
* Introduce a new command ALTER SUBSCRIPTION ... SKIP (lsn =
'0/1234'). The user can get the commit-LSN of the transaction in
question from the server logs thanks to B[2].
* The user-specified LSN (say skip-LSN) is stored in the
pg_subscription catalog.
* The apply worker skips the whole transaction if the transaction's
commit-LSN exactly matches to skip-LSN.
* The skip-LSN has an effect only on the first non-empty transaction
after the worker starts applying changes. IOW, it is cleared after
either skipping the whole transaction or successfully committing a
non-empty transaction, preventing the skip-LSN from remaining in the
catalog. Also, since the latter case means that the user set the wrong
skip-LSN, we clear it with a warning.
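To make the intended workflow concrete, here is a minimal sketch of the user-facing steps; the subscription name test_sub and the LSN value are illustrative, taken from the examples earlier in the thread:

```sql
-- The apply worker reported a conflict; the CONTEXT line in the server
-- log shows the remote transaction's finish (commit/prepare) LSN,
-- e.g. 0/14C0378.

-- Stop the apply worker first (unnecessary if disable_on_error has
-- already disabled the subscription).
ALTER SUBSCRIPTION test_sub DISABLE;

-- Tell the worker to skip the whole remote transaction whose finish LSN
-- exactly matches the given value; this stores the value in
-- pg_subscription.subskiplsn.
ALTER SUBSCRIPTION test_sub SKIP (lsn = '0/14C0378');

-- Resume replication; the first transaction whose finish LSN matches is
-- skipped in its entirety, and subskiplsn is then cleared.
ALTER SUBSCRIPTION test_sub ENABLE;
```

Note that, per the design above, the skip-LSN only matches exactly and only affects the first non-empty transaction, so a mistyped LSN is cleared (with a warning) rather than lingering in the catalog.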
As this will be displayed only in server logs and by the background
apply worker, should it be LOG or WARNING?
--
With Regards,
Amit Kapila.
On Wednesday, March 2, 2022 12:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch along with two patches for cfbot tests since the
main patch (0003) depends on the other two patches. Both
0001 and 0002 patches are the same ones I attached on another thread[2].
Hi, a few comments on v12-0003-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patch.
(1) doc/src/sgml/ref/alter_subscription.sgml
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</r$
...
+ ...After logical replication
+ successfully skips the transaction or commits non-empty transaction,
+ the LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
...
+ <term><literal>lsn</literal> (<type>pg_lsn</type>)</term>
+ <listitem>
+ <para>
+ Specifies the commit LSN of the remote transaction whose changes are to be skipped
+ by the logical replication worker. Skipping
+ individual subtransactions is not supported. Setting <literal>NONE</literal>
+ resets the LSN.
I think we'll extend the SKIP option choices beyond the 'lsn' option in
the future. In that case, shouldn't the sentence "After logical replication
successfully skips the transaction or commits non-empty transaction, the
LSN .. is cleared" be moved to the explanation of the 'lsn' option, if we
think this LSN-reset behavior is specific to the 'lsn' option?
(2) doc/src/sgml/catalogs.sgml
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Commit LSN of the transaction whose changes are to be skipped, if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
+
We need to cover the case of a PREPARE that keeps causing errors on the
subscriber. This applies to the entire patch (e.g., the rename of
skip_xact_commit_lsn).
(3) apply_handle_commit_internal comments
/*
* Helper function for apply_handle_commit and apply_handle_stream_commit.
+ * Return true if the transaction was committed, otherwise return false.
*/
If we want to make the newly added line aligned with other functions in
worker.c, should we insert one blank line before it?
(4) apply_worker_post_transaction
I'm not sure if the current refactoring is good or not.
For example, the current HEAD calls pgstat_report_stat(false)
for a commit case if we are in a transaction in apply_handle_commit_internal.
On the other hand, your refactoring calls pgstat_report_stat unconditionally
for the apply_handle_commit path. I'm not sure whether there
are many cases where apply_handle_commit is called without opening a
transaction, but is that acceptable?
Also, the name is a bit broad.
How about making a function only for stopping and resetting the LSN at this stage?
(5) comments for clear_subscription_skip_lsn
How about changing the comment like below ?
From:
Clear subskiplsn of pg_subscription catalog
To:
Clear subskiplsn of pg_subscription catalog with origin state update
Best Regards,
Takamichi Osumi
On Tue, Mar 1, 2022 at 8:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch along with two patches for cfbot tests
since the main patch (0003) depends on the other two patches. Both
0001 and 0002 patches are the same ones I attached on another
thread[2].
Few comments on 0003:
=====================
1.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Commit LSN of the transaction whose changes are to be skipped,
if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
Can't this be a prepared LSN or rollback-prepared LSN? Can we say
Finish/End LSN and then add some details about which LSNs can be there?
2. The conflict resolution explanation needs an update after the
latest commits and we should probably change the commit LSN
terminology as mentioned in the previous point.
3. The text in alter_subscription.sgml looks a bit repetitive to me
(similar to what we have in logical-replication.sgml related to
conflicts). Here also we refer to only commit LSN which needs to be
changed as mentioned in the previous two points.
4.
if (strcmp(lsn_str, "none") == 0)
+ {
+ /* Setting lsn = NONE is treated as resetting LSN */
+ lsn = InvalidXLogRecPtr;
+ }
+ else
+ {
+ /* Parse the argument as LSN */
+ lsn = DatumGetTransactionId(DirectFunctionCall1(pg_lsn_in,
+ CStringGetDatum(lsn_str)));
+
+ if (XLogRecPtrIsInvalid(lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL location (LSN): %s", lsn_str)));
Is there a reason that we don't want to allow setting 0
(InvalidXLogRecPtr) for skip LSN?
5.
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
Can we change this test to use disable_on_error feature? I am thinking
if the disable_on_error feature got committed first, maybe we can have
one test file for this and disable_on_error feature (something like
conflicts.pl).
--
With Regards,
Amit Kapila.
On Thu, Mar 10, 2022 at 2:10 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Wednesday, March 2, 2022 12:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch along with two patches for cfbot tests since the
main patch (0003) depends on the other two patches. Both
0001 and 0002 patches are the same ones I attached on another thread[2].
Hi, a few comments on v12-0003-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patch.
Thank you for the comments.
(1) doc/src/sgml/ref/alter_subscription.sgml
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</r$
...
+ ...After logical replication
+ successfully skips the transaction or commits non-empty transaction,
+ the LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
...
+ <term><literal>lsn</literal> (<type>pg_lsn</type>)</term>
+ <listitem>
+ <para>
+ Specifies the commit LSN of the remote transaction whose changes are to be skipped
+ by the logical replication worker. Skipping
+ individual subtransactions is not supported. Setting <literal>NONE</literal>
+ resets the LSN.
I think we'll extend the SKIP option choices in the future besides the 'lsn' option.
Then, one sentence "After logical replication successfully skips the transaction
or commits non-empty transaction, the LSN .. is cleared" should be moved to the
explanation for the 'lsn' section, if we think this behavior to reset LSN is
unique to the 'lsn' option?
Hmm, I think that regardless of the type of option (e.g., relid, xid,
and action whatever), resetting the specified something after that is
specific to SKIP command. SKIP command should have an effect on only
the first non-empty transaction. Otherwise, we could end up leaving it
if the user mistakenly specifies the wrong one.
(2) doc/src/sgml/catalogs.sgml
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Commit LSN of the transaction whose changes are to be skipped, if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
+
We need to cover the PREPARE that keeps causing errors on the subscriber.
This would apply to the entire patch (e.g. the rename of skip_xact_commit_lsn)
Fixed.
(3) apply_handle_commit_internal comments
/*
* Helper function for apply_handle_commit and apply_handle_stream_commit.
+ * Return true if the transaction was committed, otherwise return false.
 */
If we want to make the newly added line aligned with other functions in worker.c,
should we insert one blank line before it?
This part is removed.
(4) apply_worker_post_transaction
I'm not sure if the current refactoring is good or not.
For example, the current HEAD calls pgstat_report_stat(false)
for a commit case if we are in a transaction in apply_handle_commit_internal.
On the other hand, your refactoring calls pgstat_report_stat unconditionally
for apply_handle_commit path. I'm not sure if there
are many cases to call apply_handle_commit without opening a transaction,
but is that acceptable?
Also, the name is a bit broad.
How about making a function only for stopping and resetting the LSN at this stage?
Agreed, it seems to be overkill. I'll revert that change.
(5) comments for clear_subscription_skip_lsn
How about changing the comment like below ?
From:
Clear subskiplsn of pg_subscription catalog
To:
Clear subskiplsn of pg_subscription catalog with origin state update
Updated.
I'll submit an updated patch that incorporated comments I got so far
and is rebased to disable_on_error patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Mar 10, 2022 at 9:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 1, 2022 at 8:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated patch along with two patches for cfbot tests
since the main patch (0003) depends on the other two patches. Both
0001 and 0002 patches are the same ones I attached on another
thread[2].
Few comments on 0003:
=====================
1.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Commit LSN of the transaction whose changes are to be skipped, if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
Can't this be prepared LSN or rollback prepared LSN? Can we say
Finish/End LSN and then add some details which all LSNs can be there?
Right, changed to finish LSN.
2. The conflict resolution explanation needs an update after the
latest commits and we should probably change the commit LSN
terminology as mentioned in the previous point.
Updated.
3. The text in alter_subscription.sgml looks a bit repetitive to me
(similar to what we have in logical-replication.sgml related to
conflicts). Here also we refer to only commit LSN which needs to be
changed as mentioned in the previous two points.
Updated.
4.
if (strcmp(lsn_str, "none") == 0)
+ {
+ /* Setting lsn = NONE is treated as resetting LSN */
+ lsn = InvalidXLogRecPtr;
+ }
+ else
+ {
+ /* Parse the argument as LSN */
+ lsn = DatumGetTransactionId(DirectFunctionCall1(pg_lsn_in,
+ CStringGetDatum(lsn_str)));
+
+ if (XLogRecPtrIsInvalid(lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL location (LSN): %s", lsn_str)));
Is there a reason that we don't want to allow setting 0
(InvalidXLogRecPtr) for skip LSN?
0 is obviously an invalid value for the skip LSN, which should not be
allowed, similar to other options (like setting '' for slot_name). Also,
we use 0 (InvalidXLogRecPtr) internally to reset subskiplsn when
NONE is specified.
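To make the reset semantics concrete, a minimal sketch (the subscription name test_sub is illustrative, and the error message wording follows the patch's ereport call):

```sql
-- Setting lsn = NONE clears a previously configured skip LSN; internally
-- this stores 0/0 (InvalidXLogRecPtr) in pg_subscription.subskiplsn.
ALTER SUBSCRIPTION test_sub SKIP (lsn = NONE);

-- Verify the stored value; 0/0 means no transaction is set to be skipped.
SELECT subname, subskiplsn FROM pg_subscription
WHERE subname = 'test_sub';

-- Passing 0/0 explicitly is rejected, since 0 is reserved as the
-- "unset" value:
--   ALTER SUBSCRIPTION test_sub SKIP (lsn = '0/0');
--   ERROR:  invalid WAL location (LSN): 0/0
```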
5.
+# The subscriber will enter an infinite error loop, so we don't want
+# to overflow the server log with error messages.
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+wal_retrieve_retry_interval = 2s
+]);
Can we change this test to use the disable_on_error feature? I am thinking
if the disable_on_error feature got committed first, maybe we can have
one test file for this and disable_on_error feature (something like
conflicts.pl).
Good idea. Updated.
I've attached an updated version patch. This patch can be applied on
top of the latest disable_on_error patch[1].
Regards,
[1]: /messages/by-id/CAA4eK1Kes9TsMpGL6m+AJNHYCGRvx6piYQt5v6TEbH_t9jh8nA@mail.gmail.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v13-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patch
From 949205c4fb700979d91cf9bc8aa23e0ba0838bf5 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v13] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction
on subscriber nodes.
If an incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX),
updating pg_subscription.subskiplsn field, telling the apply worker to
skip the transaction. The apply worker skips all data modification changes
within the specified transaction.
After skipping the transaction the apply worker clears
pg_subscription.subskiplsn.
Author: Masahiko Sawada
Reviewed-by: Vignesh C, Greg Nancarrow, Takamichi Osumi, Haiying Tang, Hou Zhijie, Peter Eisentraut, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 18 +-
doc/src/sgml/ref/alter_subscription.sgml | 38 +++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 2 +-
src/backend/commands/subscriptioncmds.c | 70 ++++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 261 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 5 +-
src/include/catalog/pg_subscription.h | 5 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 117 ++++-----
src/test/regress/sql/subscription.sql | 6 +
src/test/subscription/t/030_skip_xact.pl | 183 +++++++++++++++
16 files changed, 666 insertions(+), 74 deletions(-)
create mode 100644 src/test/subscription/t/030_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 7777d60514..eec06b90e8 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7779,6 +7779,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Finish LSN of the transaction whose changes are to be skipped, if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 6431d4796d..d8a741c72e 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -366,15 +366,19 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
transaction, the subscription needs to be disabled temporarily by
<command>ALTER SUBSCRIPTION ... DISABLE</command> first or alternatively, the
subscription can be used with the <literal>disable_on_error</literal> option.
- Then, the transaction can be skipped by calling the
+ Then, the transaction can be skipped by using
+ <command>ALTER SUBSCRIPTION ... SKIP</command> with the transaction's finish LSN
+ (i.e., LSN 0/14C0378). After that the replication
+ can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>. Alternatively,
+ the transaction can also be skipped by calling the
<link linkend="pg-replication-origin-advance">
- <function>pg_replication_origin_advance()</function></link> function with
- the <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>) and the
- next LSN of the transaction's LSN (i.e., LSN 0/14C0379). After that the replication
- can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>. The current
- position of origins can be seen in the
- <link linkend="view-pg-replication-origin-status">
+ <function>pg_replication_origin_advance()</function></link> function with the
+ <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>) and the next
+ LSN of the transaction's finish LSN (i.e., 0/14C0379). The current position of
+ origins can be seen in the <link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
+ Please note that skipping the whole transaction includes skipping changes that
+ might not violate any constraint. This can easily make the subscriber inconsistent.
</para>
</sect1>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 58b78a94ea..4763599b53 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -210,6 +211,43 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the specified remote transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. Using the <command>ALTER SUBSCRIPTION ... SKIP</command> command, the
+ logical replication worker skips all data modification changes
+ within the specified transaction. This option has no effect on transactions
+ that are already prepared by enabling <literal>two_phase</literal> on the subscriber.
+ After the logical replication worker successfully skips the transaction or commits
+ a non-empty transaction, the LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>lsn</literal> (<type>pg_lsn</type>)</term>
+ <listitem>
+ <para>
+ Specifies the finish LSN of the remote transaction whose changes
+ are to be skipped by the logical replication worker. Skipping individual
+ subtransactions is not supported. Setting <literal>NONE</literal> resets the LSN.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
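To make the intended workflow concrete, here is a usage sketch of the proposed command (the subscription name and LSN value are hypothetical; the finish LSN would be taken from the apply worker's error context, as discussed above):

```sql
-- Suppose the apply worker reported a failure for the remote transaction
-- finishing at LSN 0/14C0378 (hypothetical value from the error context).
ALTER SUBSCRIPTION test_sub SKIP (lsn = '0/14C0378');

-- After the transaction has been skipped (or a non-empty transaction has
-- been applied), subskiplsn is cleared automatically. It can also be
-- reset manually:
ALTER SUBSCRIPTION test_sub SKIP (lsn = NONE);
```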
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index baf660aa24..33d3437539 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -70,6 +70,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
sub->disableonerr = subform->subdisableonerr;
+ sub->skiplsn = subform->subskiplsn;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bb1ac30cd1..bd48ee7bd2 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,7 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subdisableonerr, subslotname,
+ substream, subtwophasestate, subdisableonerr, subskiplsn, subslotname,
subsynccommit, subpublications)
ON pg_subscription TO public;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3922658bbc..459bd6d109 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -45,6 +45,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/syscache.h"
/*
@@ -62,6 +63,7 @@
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
#define SUBOPT_DISABLE_ON_ERR 0x00000400
+#define SUBOPT_LSN 0x00000800
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -84,6 +86,8 @@ typedef struct SubOpts
bool streaming;
bool twophase;
bool disableonerr;
+ XLogRecPtr lsn; /* InvalidXLogRecPtr for resetting purpose,
+ * otherwise a valid LSN */
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -262,6 +266,35 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_DISABLE_ON_ERR;
opts->disableonerr = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_LSN) &&
+ strcmp(defel->defname, "lsn") == 0)
+ {
+ char *lsn_str = defGetString(defel);
+ XLogRecPtr lsn;
+
+ if (IsSet(opts->specified_opts, SUBOPT_LSN))
+ errorConflictingDefElem(defel, pstate);
+
+ if (strcmp(lsn_str, "none") == 0)
+ {
+ /* Setting lsn = NONE is treated as resetting LSN */
+ lsn = InvalidXLogRecPtr;
+ }
+ else
+ {
+ /* Parse the argument as LSN */
+ lsn = DatumGetLSN(DirectFunctionCall1(pg_lsn_in,
+ CStringGetDatum(lsn_str)));
+
+ if (XLogRecPtrIsInvalid(lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL location (LSN): %s", lsn_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_LSN;
+ opts->lsn = lsn;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -479,6 +512,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
values[Anum_pg_subscription_subdisableonerr - 1] = BoolGetDatum(opts.disableonerr);
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1106,6 +1140,42 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ parse_subscription_options(pstate, stmt->options, SUBOPT_LSN, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only LSN option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_LSN));
+
+ /*
+ * If the user set subskiplsn, we do a sanity check to make
+ * sure that the specified LSN is a probable value.
+ */
+ if (!XLogRecPtrIsInvalid(opts.lsn))
+ {
+ RepOriginId originid;
+ char originname[NAMEDATALEN];
+ XLogRecPtr remote_lsn;
+
+ snprintf(originname, sizeof(originname), "pg_%u", subid);
+ originid = replorigin_by_name(originname, false);
+ remote_lsn = replorigin_get_progress(originid, false);
+
+ /* The specified LSN must not precede the origin's current position */
+ if (!XLogRecPtrIsInvalid(remote_lsn) && opts.lsn < remote_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
+ LSN_FORMAT_ARGS(remote_lsn))));
+ }
+
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(opts.lsn);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ update_tuple = true;
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
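The LSN handling above can be illustrated with a standalone sketch. This is not PostgreSQL's code; it mimics what `pg_lsn_in` accepts (two hexadecimal numbers separated by a slash) plus the `XLogRecPtrIsInvalid()` rejection of `0/0` as a skip target:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Standalone sketch, not PostgreSQL code: parse an LSN string of the form
 * "16/B374D848" into a 64-bit value, rejecting malformed input and 0/0
 * (InvalidXLogRecPtr), as the SKIP (lsn = ...) option does.
 */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Returns 1 and sets *lsn on success; 0 for malformed or invalid input. */
static int
parse_skip_lsn(const char *str, XLogRecPtr *lsn)
{
    unsigned int hi;
    unsigned int lo;
    int          nchars = 0;

    /* %n records how much of the string the two hex fields consumed */
    if (sscanf(str, "%X/%X%n", &hi, &lo, &nchars) != 2 || str[nchars] != '\0')
        return 0;               /* malformed */

    *lsn = ((XLogRecPtr) hi << 32) | lo;
    return *lsn != InvalidXLogRecPtr;   /* reject 0/0 */
}
```

This mirrors the regression test expectations above: `'0/0'` is reported as an invalid WAL location, while `NONE` (handled before parsing) resets the stored LSN.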
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a03b33b53b..0036c2f9e2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9983,6 +9983,15 @@ AlterSubscriptionStmt:
(Node *)makeBoolean(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index a1fe81b34f..5450ad2537 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -189,6 +190,7 @@
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/rel.h"
#include "utils/rls.h"
#include "utils/syscache.h"
@@ -259,6 +261,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the remote transaction's finish LSN matches the subskiplsn.
+ * Once we start skipping changes, we don't stop until we have skipped all
+ * changes of the transaction, even if pg_subscription is updated and
+ * MySubscription->skiplsn is changed or reset in the meanwhile. Also, in
+ * streaming transaction cases, we don't skip receiving and spooling the changes
+ * since we decide whether or not to skip applying the changes when starting to
+ * apply them. The subskiplsn is cleared after successfully skipping the
+ * transaction or applying a non-empty transaction, where the latter prevents a
+ * mistakenly specified subskiplsn from being left behind.
+ */
+static XLogRecPtr skip_xact_finish_lsn = InvalidXLogRecPtr;
+#define is_skipping_changes() (!XLogRecPtrIsInvalid(skip_xact_finish_lsn))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -336,6 +353,12 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(XLogRecPtr lsn);
+static void stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_ts);
+static void clear_subscription_skip_lsn(XLogRecPtr skiplsn, XLogRecPtr origin_lsn,
+ TimestampTz origin_ts, bool with_warning);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, XLogRecPtr lsn);
@@ -795,6 +818,8 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -847,6 +872,8 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -905,9 +932,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -928,6 +955,15 @@ apply_handle_prepare(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Since we have already prepared the transaction, if the server
+ * crashes before clearing the subskiplsn, the subskiplsn will be left
+ * behind but the transaction won't be resent. That's okay because it
+ * will be cleared when we start applying the next transaction.
+ */
+ stop_skipping_changes(prepare_data.end_lsn, prepare_data.prepare_time);
+ Assert(!IsTransactionState());
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
}
@@ -969,6 +1005,10 @@ apply_handle_commit_prepared(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ clear_subscription_skip_lsn(MySubscription->skiplsn, prepare_data.end_lsn,
+ prepare_data.commit_time, true);
+ Assert(!IsTransactionState());
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
}
@@ -1010,6 +1050,10 @@ apply_handle_rollback_prepared(StringInfo s)
FinishPreparedTransaction(gid, false);
end_replication_step();
CommitTransactionCommand();
+
+ clear_subscription_skip_lsn(MySubscription->skiplsn, rollback_data.rollback_end_lsn,
+ rollback_data.rollback_time, true);
+ Assert(!IsTransactionState());
}
pgstat_report_stat(false);
@@ -1072,6 +1116,9 @@ apply_handle_stream_prepare(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ stop_skipping_changes(prepare_data.end_lsn, prepare_data.prepare_time);
+ Assert(!IsTransactionState());
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
@@ -1335,6 +1382,8 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
remote_final_lsn = lsn;
+ maybe_start_skipping_changes(lsn);
+
/*
* Make sure the handle apply_dispatch methods are aware we're in a remote
* transaction.
@@ -1455,6 +1504,21 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
+ bool was_skipping = false;
+
+ if (is_skipping_changes())
+ {
+ /*
+ * Start a new transaction to clear the subskiplsn, if not started yet.
+ * The transaction is committed below.
+ */
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ stop_skipping_changes(commit_data->commit_lsn, commit_data->committime);
+ was_skipping = true;
+ }
+
if (IsTransactionState())
{
/*
@@ -1468,6 +1532,17 @@ apply_handle_commit_internal(LogicalRepCommitData *commit_data)
pgstat_report_stat(false);
store_flush_position(commit_data->end_lsn);
+
+ /*
+ * If a non-empty (and non-skipped) transaction was successfully
+ * committed, clear the subskiplsn.
+ */
+ if (!was_skipping)
+ {
+ clear_subscription_skip_lsn(MySubscription->skiplsn, commit_data->commit_lsn,
+ commit_data->committime, true);
+ Assert(!IsTransactionState());
+ }
}
else
{
@@ -1583,7 +1658,8 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
return;
begin_replication_step();
@@ -1710,7 +1786,8 @@ apply_handle_update(StringInfo s)
RangeTblEntry *target_rte;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
return;
begin_replication_step();
@@ -1874,7 +1951,8 @@ apply_handle_delete(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
return;
begin_replication_step();
@@ -2261,7 +2339,8 @@ apply_handle_truncate(StringInfo s)
ListCell *lc;
LOCKMODE lockmode = AccessExclusiveLock;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
return;
begin_replication_step();
@@ -3738,6 +3817,174 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given LSN matches the
+ * LSN specified by the subscription's skiplsn.
+ */
+static void
+maybe_start_skipping_changes(XLogRecPtr lsn)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn)))
+ return;
+
+ if (likely(MySubscription->skiplsn != lsn))
+ {
+ /*
+ * This is a rare case: a stale subskiplsn was left behind because the
+ * server crashed after preparing the transaction and before clearing
+ * the subskiplsn. We clear it without a warning message so as not to
+ * confuse the user.
+ */
+ if (unlikely(MySubscription->skiplsn < lsn))
+ {
+ clear_subscription_skip_lsn(MySubscription->skiplsn, InvalidXLogRecPtr, 0,
+ false);
+ Assert(!IsTransactionState());
+ }
+
+ return;
+ }
+
+ /* Start skipping all changes of this transaction */
+ skip_xact_finish_lsn = lsn;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction which finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn)));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xact_finish_lsn if skipping is in
+ * progress. Regardless of whether it is, we clear the subskiplsn.
+ *
+ * origin_lsn and origin_ts are the remote transaction's end LSN and commit
+ * timestamp, respectively.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_ts)
+{
+ if (likely(!is_skipping_changes()))
+ {
+ /*
+ * Clear the subskiplsn with a warning message if it was specified but
+ * not used.
+ */
+ clear_subscription_skip_lsn(MySubscription->skiplsn, origin_lsn,
+ origin_ts, true);
+ return;
+ }
+
+ clear_subscription_skip_lsn(skip_xact_finish_lsn, origin_lsn, origin_ts,
+ false);
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction which finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn))));
+
+ /* Stop skipping changes */
+ skip_xact_finish_lsn = InvalidXLogRecPtr;
+}
+
+/*
+ * Clear the subskiplsn of the pg_subscription catalog, updating the origin
+ * state as well.
+ *
+ * If with_warning is true, we raise a warning when clearing the subskiplsn.
+ * This informs users that the subskiplsn was cleared because it was specified
+ * but did not match the transaction's finish LSN. This can happen, e.g., when
+ * the user mistakenly specified the wrong subskiplsn.
+ */
+static void
+clear_subscription_skip_lsn(XLogRecPtr skiplsn, XLogRecPtr origin_lsn,
+ TimestampTz origin_ts, bool with_warning)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool started_tx = false;
+
+ if (likely(XLogRecPtrIsInvalid(skiplsn)))
+ return;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskiplsn of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskiplsn of the tuple to InvalidXLogRecPtr. If the user
+ * has already changed subskiplsn before we clear it, we don't update
+ * the catalog and don't advance the replication origin state. So in
+ * the worst case, if the server crashes before sending an
+ * acknowledgment of the flush position, the transaction will be sent
+ * again and the user needs to set subskiplsn again. We could reduce
+ * that possibility by logging a replication origin WAL record to
+ * advance the origin LSN instead, but there is no way to advance the
+ * origin timestamp, and it doesn't seem worth doing anything about it
+ * since this is a very rare case.
+ */
+ if (subform->subskiplsn == skiplsn)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskiplsn */
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ /*
+ * Update origin state so we can restart streaming from correct
+ * position in case of crash.
+ */
+ replorigin_session_origin_lsn = origin_lsn;
+ replorigin_session_origin_timestamp = origin_ts;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ if (with_warning)
+ ereport(WARNING,
+ errmsg("skip-LSN of logical replication subscription \"%s\" cleared", MySubscription->name),
+ errdetail("Remote transaction's finish WAL location (LSN) %X/%X did not match skip-LSN %X/%X.",
+ LSN_FORMAT_ARGS(origin_lsn),
+ LSN_FORMAT_ARGS(skiplsn)));
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
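The worker-side skip logic above can be summarized as a toy state machine. This sketch is not PostgreSQL code: skipping is latched only when the transaction's finish LSN exactly matches the stored skip LSN, every change message is dropped while it is latched, and it is reset at the transaction's end (the LSN values in the test are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the apply worker's skip state; not PostgreSQL code. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

static XLogRecPtr skip_xact_finish_lsn = InvalidXLogRecPtr;

#define is_skipping_changes() (skip_xact_finish_lsn != InvalidXLogRecPtr)

/* Called at BEGIN; finish_lsn comes from the begin message. */
static void
maybe_start_skipping(XLogRecPtr subskiplsn, XLogRecPtr finish_lsn)
{
    if (subskiplsn != InvalidXLogRecPtr && subskiplsn == finish_lsn)
        skip_xact_finish_lsn = finish_lsn;
}

/* Called for INSERT/UPDATE/DELETE/TRUNCATE; returns 1 if applied. */
static int
apply_change(void)
{
    return is_skipping_changes() ? 0 : 1;
}

/* Called at COMMIT/PREPARE; the real code also clears subskiplsn here. */
static void
stop_skipping(void)
{
    skip_xact_finish_lsn = InvalidXLogRecPtr;
}
```

Note how this matches the behavior in the patch: a subskiplsn that does not equal the finish LSN never activates skipping, which is why the catalog value is also cleared (with a warning) after a non-empty transaction commits.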
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4dd24b8c89..202bca4b23 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4359,6 +4359,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't include subskiplsn in the dump
+ * since, after restoring the dump, this value may no longer be
+ * relevant.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 9229eacb6d..4c6c370b6f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6084,7 +6084,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false};
+ false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6131,6 +6131,12 @@ describeSubscriptions(const char *pattern, bool verbose)
", subconninfo AS \"%s\"\n",
gettext_noop("Synchronous commit"),
gettext_noop("Conninfo"));
+
+ /* Skip LSN is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskiplsn AS \"%s\"\n",
+ gettext_noop("Skip LSN"));
}
/* Only display subscriptions in current database. */
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e630acc70d..1295f79340 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1819,7 +1819,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP (",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1835,6 +1835,9 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit", "disable_on_error");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("lsn");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index e2befaf351..9381228806 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
bool subdisableonerr; /* True if a worker error should cause the
* subscription to be disabled */
+ XLogRecPtr subskiplsn; /* All changes finished at this LSN are
+ * skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -109,6 +112,8 @@ typedef struct Subscription
bool disableonerr; /* Indicates if the subscription should be
* automatically disabled if a worker error
* occurs */
+ XLogRecPtr skiplsn; /* All changes which finished at this LSN are
+ * skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 1617702d9d..6f83a79a96 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3726,7 +3726,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ad8003fae1..914aef0385 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,16 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -129,10 +134,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +170,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +193,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -215,10 +220,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -233,10 +238,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +275,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +287,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +299,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -309,18 +314,18 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index a7c15b1daf..19aeec9277 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,12 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/030_skip_xact.pl b/src/test/subscription/t/030_skip_xact.pl
new file mode 100644
index 0000000000..cc6a6e32d5
--- /dev/null
+++ b/src/test/subscription/t/030_skip_xact.pl
@@ -0,0 +1,183 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 4;
+
+my $offset = 0;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with the subscriber. The finish LSN of the
+# error transaction that is used to specify to ALTER SUBSCRIPTION ... SKIP is
+# fetched from the server logs. After executing ALTER SUBSCRIPTION ... SKIP, we
+# check if logical replication can continue working by inserting $nonconflict_data
+# on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $subname, $relname,
+ $nonconflict_data, $expected, $msg)
+ = @_;
+
+ # Wait until a conflict occurs on the subscriber.
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subenabled = FALSE FROM pg_subscription WHERE subname = '$subname'
+]);
+
+ # Get the finish LSN of the error transaction.
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.$relname" in transaction \d+ finished at ([[:xdigit:]]+\/[[:xdigit:]]+)/
+ or die "could not get error-LSN";
+ my $lsn = $1;
+
+ # Set skip lsn.
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (lsn = '$lsn')");
+
+ # Re-enable the subscription.
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname ENABLE");
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = '$subname'"
+ );
+
+ # Check the log indicating that successfully skipped the transaction, and
+ # advance the offset of the log file for the next test.
+ $offset = $node_subscriber->wait_for_log(
+ qr/LOG: done skipping logical replication transaction which finished at $lsn/,
+ $offset);
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql('postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node. Set a low value to logical_decoding_work_mem
+# so we can test streaming cases easily.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf(
+ 'postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber, we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int);
+CREATE TABLE test_tab_streaming (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE test_tab (a int primary key);
+CREATE TABLE test_tab_streaming (a int primary key, b text);
+INSERT INTO test_tab VALUES (1);
+INSERT INTO test_tab_streaming VALUES (1, md5(1::text));
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab;
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+
+# Create subscriptions. Both subscription sets disable_on_error to on
+# so that they get disabled when a conflict occurs.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on, disable_on_error = on);
+CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr' PUBLICATION tap_pub_streaming WITH (streaming = on, disable_on_error = on);
+]);
+
+$node_publisher->wait_for_catchup('tap_sub');
+$node_publisher->wait_for_catchup('tap_sub_streaming');
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT COUNT(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('s', 'r')
+]);
+
+# Insert data to test_tab, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(2)", "2", "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data to test_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab VALUES (1);
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub", "test_tab",
+ "(3)", "3", "test skipping prepare and commit prepared ");
+
+# Test for STREAM COMMIT. Insert enough rows to test_tab_streaming to exceed the 64kB
+# limit, also raising an error on the subscriber during applying spooled changes for the
+# same reason. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "tap_sub_streaming",
+ "test_tab_streaming", "(2, md5(2::text))",
+ "2", "test skipping stream-commit");
+
+my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0",
+ "check all prepared transactions are resolved on the subscriber");
--
2.24.3 (Apple Git-128)
On Friday, March 11, 2022 5:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch. This patch can be applied on top of the
latest disable_on_error patch[1].
Hi, thank you for the patch. I'll share my review comments on v13.
(a) src/backend/commands/subscriptioncmds.c
@@ -84,6 +86,8 @@ typedef struct SubOpts
bool streaming;
bool twophase;
bool disableonerr;
+ XLogRecPtr lsn; /* InvalidXLogRecPtr for resetting purpose,
+ * otherwise a valid LSN */
I think this explanation is slightly odd and can be improved.
Strictly speaking, I feel a *valid* LSN is for the purpose of skipping the
transaction, from the functional perspective. Also, the wording "resetting purpose"
is unclear by itself. I suggest the change below.
From:
InvalidXLogRecPtr for resetting purpose, otherwise a valid LSN
To:
A valid LSN when we skip transaction, otherwise InvalidXLogRecPtr
(b) The code position of additional append in describeSubscriptions
+
+ /* Skip LSN is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskiplsn AS \"%s\"\n",
+ gettext_noop("Skip LSN"));
I suggest placing this code right after subdisableonerr.
(c) parse_subscription_options
+ /* Parse the argument as LSN */
+ lsn = DatumGetTransactionId(DirectFunctionCall1(pg_lsn_in,
Here, shouldn't we call DatumGetLSN, instead of DatumGetTransactionId ?
(d) parse_subscription_options
+ if (strcmp(lsn_str, "none") == 0)
+ {
+ /* Setting lsn = NONE is treated as resetting LSN */
+ lsn = InvalidXLogRecPtr;
+ }
+
We should remove this pair of curly brackets, since it encloses a single statement.
(e) src/backend/replication/logical/worker.c
+ * to skip applying the changes when starting to apply changes. The subskiplsn is
+ * cleared after successfully skipping the transaction or applying non-empty
+ * transaction, where the later avoids the mistakenly specified subskiplsn from
+ * being left.
typo "the later" -> "the latter"
At the same time, I feel the last part of this sentence can be an independent sentence.
From:
, where the later avoids the mistakenly specified subskiplsn from being left
To:
. The latter prevents the mistakenly specified subskiplsn from being left
* Note that my comments below apply if we choose not to merge the disable_on_error test with the skip-LSN tests.
(f) src/test/subscription/t/030_skip_xact.pl
+use Test::More tests => 4;
It's better to utilize the new style for the TAP test.
Then, probably we should introduce done_testing()
at the end of the test.
(g) src/test/subscription/t/030_skip_xact.pl
I think there's no need to create two types of subscriptions.
Just one subscription with two_phase = on and streaming = on
would be sufficient for the tests (normal commit, commit prepared,
and stream commit cases). This would also reduce the number of
tables and publications, making the whole test simpler.
Best Regards,
Takamichi Osumi
On Fri, Mar 11, 2022 4:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch. This patch can be applied on
top of the latest disable_on_error patch[1].
Thanks for your patch. Here are some comments for the v13 patch.
1. doc/src/sgml/ref/alter_subscription.sgml
+ Specifies the transaction's finish LSN of the remote transaction whose changes
Could it be simplified to "Specifies the finish LSN of the remote transaction
whose ...".
2.
I met a failed assertion, the backtrace is attached. This is caused by the
following code in maybe_start_skipping_changes().
+ /*
+ * It's a rare case; a past subskiplsn was left because the server
+ * crashed after preparing the transaction and before clearing the
+ * subskiplsn. We clear it without a warning message so as not confuse
+ * the user.
+ */
+ if (unlikely(MySubscription->skiplsn < lsn))
+ {
+ clear_subscription_skip_lsn(MySubscription->skiplsn, InvalidXLogRecPtr, 0,
+ false);
+ Assert(!IsTransactionState());
+ }
We want to clear subskiplsn in the case mentioned in the comment. But if the next
transaction is a streaming transaction and this function is called by
apply_spooled_messages(), we are inside a transaction here. So, I think this
assertion is not suitable for streaming transactions. Thoughts?
3.
+ XLogRecPtr subskiplsn; /* All changes which committed at this LSN are
+ * skipped */
To be consistent, should the comment be changed to "All changes which finished
at this LSN are skipped"?
4.
+ After logical replication worker successfully skips the transaction or commits
+ non-empty transaction, the LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared.
Besides "commits non-empty transaction", subskiplsn would also be cleared in
some two-phase commit cases I think. Like prepare/commit/rollback a transaction,
even if it is an empty transaction. So, should we change it for these cases?
5.
+ * Clear subskiplsn of pg_subscription catalog with origin state update.
Should "with origin state update" modified to "with origin state updated"?
Regards,
Shi yu
Attachments:
On Friday, March 11, 2022 5:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch. This patch can be applied on top of the
latest disable_on_error patch[1].
Hi, a few extra comments on v13.
(1) src/backend/replication/logical/worker.c
With regard to clear_subscription_skip_lsn,
There are cases where we conduct the origin state update twice.
For instance, in the case where we reset subskiplsn by executing an
irrelevant non-empty transaction, the first update is
conducted at apply_handle_commit_internal and the second one
at clear_subscription_skip_lsn. In the second update,
we set replorigin_session_origin_lsn to a smaller value (commit_lsn)
than the first update did (end_lsn). Was this intentional and OK?
(2) src/backend/replication/logical/worker.c
+ * Both origin_lsn and origin_timestamp are the remote transaction's end_lsn
+ * and commit timestamp, respectively.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_ts)
Typo. Should change 'origin_timestamp' to 'origin_ts',
because the name of the argument is the latter.
Also, here we handle not only commit but also prepare.
You need to fix the comment "commit timestamp" as well.
(3) src/backend/replication/logical/worker.c
+/*
+ * Clear subskiplsn of pg_subscription catalog with origin state update.
+ *
+ * if with_warning is true, we raise a warning when clearing the subskipxid.
It's better to insert this second sentence as the last sentence of
the other comments. It should start with a capital letter as well.
Best Regards,
Takamichi Osumi
On Mon, Mar 14, 2022 at 6:50 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
On Fri, Mar 11, 2022 4:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch. This patch can be applied on
top of the latest disable_on_error patch[1].
Thanks for your patch. Here are some comments for the v13 patch.
Thank you for the comments!
1. doc/src/sgml/ref/alter_subscription.sgml
+ Specifies the transaction's finish LSN of the remote transaction whose changes
Could it be simplified to "Specifies the finish LSN of the remote transaction
whose ...".
Fixed.
2.
I met a failed assertion, the backtrace is attached. This is caused by the
following code in maybe_start_skipping_changes().
+ /*
+ * It's a rare case; a past subskiplsn was left because the server
+ * crashed after preparing the transaction and before clearing the
+ * subskiplsn. We clear it without a warning message so as not confuse
+ * the user.
+ */
+ if (unlikely(MySubscription->skiplsn < lsn))
+ {
+ clear_subscription_skip_lsn(MySubscription->skiplsn, InvalidXLogRecPtr, 0,
+ false);
+ Assert(!IsTransactionState());
+ }
We want to clear subskiplsn in the case mentioned in the comment. But if the next
transaction is a streaming transaction and this function is called by
apply_spooled_messages(), we are inside a transaction here. So, I think this
assertion is not suitable for streaming transactions. Thoughts?
Good catch. After more thought, I realized that the assumption of this
if statement is wrong and we don't necessarily need to do this here, since
the leftover skip-LSN will eventually be cleared when the next transaction
finishes. So I removed this part.
3.
+ XLogRecPtr subskiplsn; /* All changes which committed at this LSN are
+ * skipped */
To be consistent, should the comment be changed to "All changes which finished
at this LSN are skipped"?
Fixed.
4.
+ After logical replication worker successfully skips the transaction or commits
+ non-empty transaction, the LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared.
Besides "commits non-empty transaction", subskiplsn would also be cleared in
some two-phase commit cases, I think, such as when we prepare/commit/rollback a
transaction, even if it is an empty transaction. So, should we change it for these cases?
Fixed.
5.
+ * Clear subskiplsn of pg_subscription catalog with origin state update.
Should "with origin state update" be modified to "with origin state updated"?
Fixed.
I'll submit an updated patch soon.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Hi,
On Fri, Mar 11, 2022 at 8:37 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Friday, March 11, 2022 5:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch. This patch can be applied on top of the
latest disable_on_error patch[1].Hi, thank you for the patch. I'll share my review comments on v13.
(a) src/backend/commands/subscriptioncmds.c
@@ -84,6 +86,8 @@ typedef struct SubOpts
 bool streaming;
 bool twophase;
 bool disableonerr;
+ XLogRecPtr lsn; /* InvalidXLogRecPtr for resetting purpose,
+ * otherwise a valid LSN */
I think this explanation is slightly odd and can be improved.
Strictly speaking, I feel a *valid* LSN is for the purpose of skipping the
transaction, from the functional perspective. Also, the wording "resetting purpose"
is unclear by itself. I suggest the change below.
From:
InvalidXLogRecPtr for resetting purpose, otherwise a valid LSN
To:
A valid LSN when we skip transaction, otherwise InvalidXLogRecPtr
"when we skip transaction" sounds incorrect to me since it's just an
option value but does not indicate that we really skip the transaction
that has that LSN. I realized that we directly use InvalidXLogRecPtr
for subskiplsn, so I think there is no need to mention it.
(b) The code position of additional append in describeSubscriptions
+
+ /* Skip LSN is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskiplsn AS \"%s\"\n",
+ gettext_noop("Skip LSN"));
I suggest placing this code right after subdisableonerr.
I got the comment[1] from Peter to put it at the end, which looks better to me.
(c) parse_subscription_options
+ /* Parse the argument as LSN */
+ lsn = DatumGetTransactionId(DirectFunctionCall1(pg_lsn_in,
Here, shouldn't we call DatumGetLSN instead of DatumGetTransactionId?
Right, fixed.
(d) parse_subscription_options
+ if (strcmp(lsn_str, "none") == 0)
+ {
+ /* Setting lsn = NONE is treated as resetting LSN */
+ lsn = InvalidXLogRecPtr;
+ }
+
We should remove this pair of curly brackets, since it encloses a single statement.
I moved the comment on top of the if statement and removed the brackets.
(e) src/backend/replication/logical/worker.c
+ * to skip applying the changes when starting to apply changes. The subskiplsn is
+ * cleared after successfully skipping the transaction or applying non-empty
+ * transaction, where the later avoids the mistakenly specified subskiplsn from
+ * being left.
typo "the later" -> "the latter"
At the same time, I feel the last part of this sentence can be an independent sentence.
From:
, where the later avoids the mistakenly specified subskiplsn from being left
To:
. The latter prevents the mistakenly specified subskiplsn from being left
Fixed.
* Note that my comments below apply if we choose not to merge the disable_on_error test with the skip-LSN tests.
(f) src/test/subscription/t/030_skip_xact.pl
+use Test::More tests => 4;
It's better to utilize the new style for the TAP test.
Then, probably we should introduce done_testing()
at the end of the test.
Fixed.
(g) src/test/subscription/t/030_skip_xact.pl
I think there's no need to create two types of subscriptions.
Just one subscription with two_phase = on and streaming = on
would be sufficient for the tests (normal commit, commit prepared,
and stream commit cases). This would also reduce the number of
tables and publications, making the whole test simpler.
Good point, fixed.
On Mon, Mar 14, 2022 at 9:39 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Friday, March 11, 2022 5:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch. This patch can be applied on top of the
latest disable_on_error patch[1].
Hi, a few extra comments on v13.
(1) src/backend/replication/logical/worker.c
With regard to clear_subscription_skip_lsn,
There are cases where we conduct the origin state update twice.
For instance, in the case where we reset subskiplsn by executing an
irrelevant non-empty transaction, the first update is
conducted at apply_handle_commit_internal and the second one
at clear_subscription_skip_lsn. In the second update,
we set replorigin_session_origin_lsn to a smaller value (commit_lsn)
than the first update did (end_lsn). Was this intentional and OK?
Good catch, this part is removed in the latest patch.
(2) src/backend/replication/logical/worker.c
+ * Both origin_lsn and origin_timestamp are the remote transaction's end_lsn
+ * and commit timestamp, respectively.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_ts)
Typo. Should change 'origin_timestamp' to 'origin_ts',
because the name of the argument is the latter.
Also, here we handle not only commit but also prepare.
You need to fix the comment "commit timestamp" as well.
Fixed.
(3) src/backend/replication/logical/worker.c
+/*
+ * Clear subskiplsn of pg_subscription catalog with origin state update.
+ *
+ * if with_warning is true, we raise a warning when clearing the subskipxid.
It's better to insert this second sentence as the last sentence of
the other comments.
with_warning is removed in the latest patch.
I've attached an updated version patch.
Regards,
[1]: /messages/by-id/09b80566-c790-704b-35b4-33f87befc41f@enterprisedb.com
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v14-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patch
From 044910c607ef91839bffb70d081ae682f3756e1c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v14] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction
on subscriber nodes.
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX),
updating pg_subscription.subskiplsn field, telling the apply worker to
skip the transaction. The apply worker skips all data modification changes
within the specified transaction.
After successfully skipping the transaction or finishing the
transaction, the apply worker clears pg_subscription.subskiplsn.
Author: Masahiko Sawada
Reviewed-by: Vignesh C, Greg Nancarrow, Takamichi Osumi, Haiying Tang, Hou Zhijie, Peter Eisentraut, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 18 +-
doc/src/sgml/ref/alter_subscription.sgml | 40 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 2 +-
src/backend/commands/subscriptioncmds.c | 67 +++++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 218 ++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 5 +-
src/include/catalog/pg_subscription.h | 5 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 126 ++++++------
src/test/regress/sql/subscription.sql | 11 ++
src/test/subscription/t/030_skip_xact.pl | 182 +++++++++++++++++
16 files changed, 635 insertions(+), 74 deletions(-)
create mode 100644 src/test/subscription/t/030_skip_xact.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 7777d60514..eec06b90e8 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7779,6 +7779,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Finish LSN of the transaction whose changes are to be skipped, if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 6431d4796d..18e4e4b186 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -366,15 +366,19 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
transaction, the subscription needs to be disabled temporarily by
<command>ALTER SUBSCRIPTION ... DISABLE</command> first or alternatively, the
subscription can be used with the <literal>disable_on_error</literal> option.
- Then, the transaction can be skipped by calling the
+ Then, the transaction can be skipped by using
+ <command>ALTER SUBSCRIPTION ... SKIP</command> with the finish LSN
+ (i.e., LSN 0/14C0378). After that, the replication
+ can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>.
+ Alternatively, the transaction can also be skipped by calling the
<link linkend="pg-replication-origin-advance">
- <function>pg_replication_origin_advance()</function></link> function with
- the <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>) and the
- next LSN of the transaction's LSN (i.e., LSN 0/14C0379). After that the replication
- can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>. The current
- position of origins can be seen in the
- <link linkend="view-pg-replication-origin-status">
+ <function>pg_replication_origin_advance()</function></link> function with the
+ <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>) and the next
+ LSN of the finish LSN (i.e., 0/14C0379). The current position of origins can
+ be seen in the <link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
+ Please note that skipping the whole transaction includes skipping changes that
+ might not violate any constraint. This can easily make the subscriber inconsistent.
</para>
</sect1>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 58b78a94ea..266b5717d5 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -210,6 +211,45 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the specified remote transaction. If incoming
+ data violates any constraint, logical replication will stop until it is
+ resolved. By using the <command>ALTER SUBSCRIPTION ... SKIP</command> command,
+ the logical replication worker skips all data modification changes within
+ the specified transaction. This option has no effect on transactions
+ that are already prepared by enabling <literal>two_phase</literal> on the
+ subscriber.
+ After the logical replication worker successfully skips or finishes the
+ transaction, the LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>lsn</literal> (<type>pg_lsn</type>)</term>
+ <listitem>
+ <para>
+ Specifies the finish LSN of the remote transaction whose changes
+ are to be skipped by the logical replication worker. Skipping individual
+ subtransactions is not supported. Setting <literal>NONE</literal>
+ resets the LSN.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a6304f5f81..0ff0982f7b 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -70,6 +70,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
sub->disableonerr = subform->subdisableonerr;
+ sub->skiplsn = subform->subskiplsn;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bb1ac30cd1..bd48ee7bd2 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,7 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subdisableonerr, subslotname,
+ substream, subtwophasestate, subdisableonerr, subskiplsn, subslotname,
subsynccommit, subpublications)
ON pg_subscription TO public;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3922658bbc..4d0bee0403 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -45,6 +45,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/syscache.h"
/*
@@ -62,6 +63,7 @@
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
#define SUBOPT_DISABLE_ON_ERR 0x00000400
+#define SUBOPT_LSN 0x00000800
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -84,6 +86,7 @@ typedef struct SubOpts
bool streaming;
bool twophase;
bool disableonerr;
+ XLogRecPtr lsn;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -262,6 +265,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_DISABLE_ON_ERR;
opts->disableonerr = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_LSN) &&
+ strcmp(defel->defname, "lsn") == 0)
+ {
+ char *lsn_str = defGetString(defel);
+ XLogRecPtr lsn;
+
+ if (IsSet(opts->specified_opts, SUBOPT_LSN))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting lsn = NONE is treated as resetting LSN */
+ if (strcmp(lsn_str, "none") == 0)
+ lsn = InvalidXLogRecPtr;
+ else
+ {
+ /* Parse the argument as LSN */
+ lsn = DatumGetLSN(DirectFunctionCall1(pg_lsn_in,
+ CStringGetDatum(lsn_str)));
+
+ if (XLogRecPtrIsInvalid(lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL location (LSN): %s", lsn_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_LSN;
+ opts->lsn = lsn;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -479,6 +509,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
values[Anum_pg_subscription_subdisableonerr - 1] = BoolGetDatum(opts.disableonerr);
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1106,6 +1137,42 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ parse_subscription_options(pstate, stmt->options, SUBOPT_LSN, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only LSN option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_LSN));
+
+ /*
+ * If the user set subskiplsn, we do a sanity check to make
+ * sure that the specified LSN is a probable value.
+ */
+ if (!XLogRecPtrIsInvalid(opts.lsn))
+ {
+ RepOriginId originid;
+ char originname[NAMEDATALEN];
+ XLogRecPtr remote_lsn;
+
+ snprintf(originname, sizeof(originname), "pg_%u", subid);
+ originid = replorigin_by_name(originname, false);
+ remote_lsn = replorigin_get_progress(originid, false);
+
+ /* Check the given LSN is at least a future LSN */
+ if (!XLogRecPtrIsInvalid(remote_lsn) && opts.lsn < remote_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
+ LSN_FORMAT_ARGS(remote_lsn))));
+ }
+
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(opts.lsn);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ update_tuple = true;
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a03b33b53b..0036c2f9e2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9983,6 +9983,15 @@ AlterSubscriptionStmt:
(Node *)makeBoolean(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 03e069c7cd..18419c9a60 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -189,6 +190,7 @@
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/rel.h"
#include "utils/rls.h"
#include "utils/syscache.h"
@@ -259,6 +261,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the remote transaction's finish LSN matches the subskiplsn.
+ * Once we start skipping changes, we don't stop until we have skipped all
+ * changes of the transaction, even if pg_subscription is updated and
+ * MySubscription->skiplsn gets changed or reset in the meantime. Also, in
+ * streaming transaction cases, we don't skip receiving and spooling the
+ * changes since we decide whether or not to skip applying the changes when
+ * starting to apply them. The subskiplsn is cleared after successfully
+ * skipping the transaction or applying a non-empty transaction; the latter
+ * prevents a mistakenly specified subskiplsn from being left behind.
+ */
+static XLogRecPtr skip_xact_finish_lsn = InvalidXLogRecPtr;
+#define is_skipping_changes() (unlikely(!XLogRecPtrIsInvalid(skip_xact_finish_lsn)))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -336,6 +353,11 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(XLogRecPtr finish_lsn);
+static void stop_skipping_changes(void);
+static void clear_subscription_skip_lsn(XLogRecPtr finish_lsn);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, XLogRecPtr lsn);
@@ -795,6 +817,8 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -847,6 +871,8 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -905,9 +931,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -928,6 +954,15 @@ apply_handle_prepare(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Since we have already prepared the transaction, if the server crashes
+ * before clearing the subskiplsn, it will be left set even though the
+ * transaction won't be resent. But that's okay because it's a rare case
+ * and the subskiplsn will be cleared when the next transaction finishes.
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
}
@@ -969,6 +1004,8 @@ apply_handle_commit_prepared(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ clear_subscription_skip_lsn(prepare_data.end_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
}
@@ -1010,6 +1047,8 @@ apply_handle_rollback_prepared(StringInfo s)
FinishPreparedTransaction(gid, false);
end_replication_step();
CommitTransactionCommand();
+
+ clear_subscription_skip_lsn(rollback_data.rollback_end_lsn);
}
pgstat_report_stat(false);
@@ -1072,6 +1111,13 @@ apply_handle_stream_prepare(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Similar to the prepare case, the subskiplsn could be left set after a
+ * server crash, but that's okay. See the comments in apply_handle_prepare().
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
@@ -1311,6 +1357,8 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
MemoryContext oldcxt;
BufFile *fd;
+ maybe_start_skipping_changes(lsn);
+
/* Make sure we have an open transaction */
begin_replication_step();
@@ -1455,8 +1503,26 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes();
+
+ /*
+ * Start a new transaction to clear the subskiplsn, if not started
+ * yet. The transaction is committed below.
+ */
+ if (!IsTransactionState())
+ StartTransactionCommand();
+ }
+
if (IsTransactionState())
{
+ /*
+ * The transaction is either non-empty or skipped, so we clear the
+ * subskiplsn.
+ */
+ clear_subscription_skip_lsn(commit_data->commit_lsn);
+
/*
* Update origin state so we can restart streaming from correct
* position in case of crash.
@@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
return;
begin_replication_step();
@@ -1710,7 +1777,8 @@ apply_handle_update(StringInfo s)
RangeTblEntry *target_rte;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
return;
begin_replication_step();
@@ -1874,7 +1942,8 @@ apply_handle_delete(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
return;
begin_replication_step();
@@ -2261,7 +2330,8 @@ apply_handle_truncate(StringInfo s)
ListCell *lc;
LOCKMODE lockmode = AccessExclusiveLock;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
return;
begin_replication_step();
@@ -3738,6 +3808,140 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given LSN matches the
+ * LSN specified by subscription's skiplsn.
+ */
+static void
+maybe_start_skipping_changes(XLogRecPtr finish_lsn)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /*
+ * Quick return if we are not requested to skip this transaction. This
+ * function is called at the start of applying changes for every
+ * transaction, and we assume that skipping is rarely requested.
+ */
+ if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn) ||
+ MySubscription->skiplsn != finish_lsn))
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skip_xact_finish_lsn = finish_lsn;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction which finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn)));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xact_finish_lsn if enabled.
+ */
+static void
+stop_skipping_changes(void)
+{
+ if (!is_skipping_changes())
+ return;
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction which finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn))));
+
+ /* Stop skipping changes */
+ skip_xact_finish_lsn = InvalidXLogRecPtr;
+}
+
+/*
+ * Clear subskiplsn of pg_subscription catalog with origin state updated.
+ *
+ * finish_lsn is the transaction's finish LSN, which is used to check whether
+ * the subskiplsn matches it. If it does not match, we raise a warning when
+ * clearing the subskiplsn in order to inform users of cases where, e.g., the
+ * user mistakenly specified the wrong subskiplsn.
+ */
+static void
+clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ XLogRecPtr myskiplsn = MySubscription->skiplsn;
+ bool started_tx = false;
+
+ if (likely(XLogRecPtrIsInvalid(myskiplsn)))
+ return;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskiplsn of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Update the subskiplsn of the tuple to InvalidXLogRecPtr. If the user
+ * has already changed the subskiplsn before we clear it, we don't update
+ * the catalog and don't advance the replication origin state. So in the
+ * worst case, if the server crashes before sending an acknowledgment of
+ * the flush position, the transaction will be sent again and the user
+ * needs to set subskiplsn again. We could reduce the possibility by
+ * logging a replication origin WAL record to advance the origin LSN
+ * instead, but there is no way to advance the origin timestamp and it
+ * doesn't seem worth doing anything about it since it's a very rare
+ * case.
+ */
+ if (subform->subskiplsn == myskiplsn)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskiplsn */
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ if (myskiplsn != finish_lsn)
+ ereport(WARNING,
+ errmsg("skip-LSN of logical replication subscription \"%s\" cleared", MySubscription->name),
+ errdetail("Remote transaction's finish WAL location (LSN) %X/%X did not match skip-LSN %X/%X.",
+ LSN_FORMAT_ARGS(finish_lsn),
+ LSN_FORMAT_ARGS(myskiplsn)));
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4dd24b8c89..202bca4b23 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4359,6 +4359,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't include subskiplsn in the dump, as
+ * after restoring the dump this value may no longer be relevant.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 9229eacb6d..4c6c370b6f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6084,7 +6084,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false};
+ false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6131,6 +6131,12 @@ describeSubscriptions(const char *pattern, bool verbose)
", subconninfo AS \"%s\"\n",
gettext_noop("Synchronous commit"),
gettext_noop("Conninfo"));
+
+ /* Skip LSN is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskiplsn AS \"%s\"\n",
+ gettext_noop("Skip LSN"));
}
/* Only display subscriptions in current database. */
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 17172827a9..11cb41128e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1819,7 +1819,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP (",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1835,6 +1835,9 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit", "disable_on_error");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("lsn");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index e2befaf351..9275212d56 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
bool subdisableonerr; /* True if a worker error should cause the
* subscription to be disabled */
+ XLogRecPtr subskiplsn; /* All changes which finished at this LSN are
+ * skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -109,6 +112,8 @@ typedef struct Subscription
bool disableonerr; /* Indicates if the subscription should be
* automatically disabled if a worker error
* occurs */
+ XLogRecPtr skiplsn; /* All changes which finished at this LSN are
+ * skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 1617702d9d..6f83a79a96 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3726,7 +3726,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ad8003fae1..7fcfad1591 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,25 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/12345
+(1 row)
+
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -129,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +179,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +202,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -215,10 +229,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -233,10 +247,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +284,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +296,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +308,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -309,18 +323,18 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index a7c15b1daf..74c38ead5d 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,17 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
+
+\dRs+
+
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/030_skip_xact.pl b/src/test/subscription/t/030_skip_xact.pl
new file mode 100644
index 0000000000..d631400673
--- /dev/null
+++ b/src/test/subscription/t/030_skip_xact.pl
@@ -0,0 +1,182 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Tests for skipping logical replication transactions
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $offset = 0;
+my $relname = 'tap_tab';
+my $subname = 'tap_sub';
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with the subscriber. The finish LSN of the
+# error transaction, which is passed to ALTER SUBSCRIPTION ... SKIP, is
+# fetched from the server logs. After executing ALTER SUBSCRIPTION ... SKIP, we
+# check if logical replication can continue working by inserting $nonconflict_data
+# on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $nonconflict_data, $expected, $msg)
+ = @_;
+
+ # Wait until a conflict occurs on the subscriber.
+ $node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT subenabled = FALSE FROM pg_subscription WHERE subname = '$subname'
+]);
+
+ # Get the finish LSN of the error transaction.
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.$relname" in transaction \d+ finished at ([[:xdigit:]]+\/[[:xdigit:]]+)/
+ or die "could not get error-LSN";
+ my $lsn = $1;
+
+ # Set skip lsn.
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname SKIP (lsn = '$lsn')");
+
+ # Re-enable the subscription.
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION $subname ENABLE");
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = '$subname'"
+ );
+
+ # Check the log indicating that successfully skipped the transaction, and
+ # advance the offset of the log file for the next test.
+ $offset = $node_subscriber->wait_for_log(
+ qr/LOG: done skipping logical replication transaction which finished at $lsn/,
+ $offset);
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql('postgres',
+ "INSERT INTO $relname VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup($subname);
+
+ # Check replicated data
+ my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM $relname");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node. Set a low value to logical_decoding_work_mem
+# so we can test streaming cases easily.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf(
+ 'postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber, we
+# create the same tables but with primary keys. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE $relname (a int, b text);
+COMMIT;
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+CREATE TABLE $relname (a int primary key, b text);
+INSERT INTO $relname VALUES (1);
+COMMIT;
+]);
+
+# Setup publications
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE PUBLICATION tap_pub FOR TABLE $relname;
+]);
+
+# Create subscriptions. Both subscription sets disable_on_error to on
+# so that they get disabled when a conflict occurs.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION $subname CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (streaming = on, two_phase = on, disable_on_error = on);
+]);
+
+$node_publisher->wait_for_catchup($subname);
+$node_subscriber->poll_query_until(
+ 'postgres',
+ qq[
+SELECT COUNT(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('s', 'r')
+]);
+
+# Insert data into tap_tab, raising an error on the subscriber due to violation
+# of its unique constraint. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO $relname VALUES (1);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber,
+ "(2, NULL)", "2", "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data into tap_tab and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO $relname VALUES (1);
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber,
+ "(3, NULL)", "3", "test skipping prepare and commit prepared");
+
+# Test for STREAM COMMIT. Insert enough rows into tap_tab to exceed the 64kB
+# limit, also raising an error on the subscriber during applying spooled
+# changes for the same reason. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO $relname SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "(4, md5(4::text))",
+ "4", "test skipping stream-commit");
+
+my $res = $node_subscriber->safe_psql('postgres',
+ "SELECT count(*) FROM pg_prepared_xacts");
+is($res, "0",
+ "check all prepared transactions are resolved on the subscriber");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.24.3 (Apple Git-128)
On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
Review:
=======
1.
+++ b/doc/src/sgml/logical-replication.sgml
@@ -366,15 +366,19 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
transaction, the subscription needs to be disabled temporarily by
<command>ALTER SUBSCRIPTION ... DISABLE</command> first or
alternatively, the
subscription can be used with the
<literal>disable_on_error</literal> option.
- Then, the transaction can be skipped by calling the
+ Then, the transaction can be skipped by using
+ <command>ALTER SUBSCRITPION ... SKIP</command> with the finish LSN
+ (i.e., LSN 0/14C0378). After that the replication
+ can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>.
+ Alternatively, the transaction can also be skipped by calling the
Do we really need to disable the subscription for the skip feature? I
think that is required for origin_advance. Also, probably, we can say
Finish LSN could be Prepare LSN, Commit LSN, etc.
2.
+ /*
+ * Quick return if it's not requested to skip this transaction. This
+ * function is called every start of applying changes and we assume that
+ * skipping the transaction is not used in many cases.
+ */
+ if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn) ||
The second part of this comment (especially ".. every start of
applying changes ..") sounds slightly odd to me. How about changing it
to: "This function is called for every remote transaction and we
assume that skipping the transaction is not used in many cases."
3.
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction which
finished at %X/%X",
...
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction which
finished at %X/%X",
No need for 'which' in the above LOG messages. I think the messages will
be clear without it.
4.
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction which
finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn))));
+
+ /* Stop skipping changes */
+ skip_xact_finish_lsn = InvalidXLogRecPtr;
Let's reverse the order of these statements to make them consistent
with the corresponding maybe_start_* function.
5.
+
+ if (myskiplsn != finish_lsn)
+ ereport(WARNING,
+ errmsg("skip-LSN of logical replication subscription \"%s\"
cleared", MySubscription->name),
Shouldn't this be a LOG instead of a WARNING, as it will only be
displayed in the server log by the background apply worker?
6.
@@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||
Is there a reason to keep the skip_changes check here and in other DML
operations instead of at one central place in apply_dispatch?
7.
+ /*
+ * Start a new transaction to clear the subskipxid, if not started
+ * yet. The transaction is committed below.
+ */
+ if (!IsTransactionState())
I think the second part of the comment: "The transaction is committed
below." is not required.
8.
+ XLogRecPtr subskiplsn; /* All changes which finished at this LSN are
+ * skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -109,6 +112,8 @@ typedef struct Subscription
bool disableonerr; /* Indicates if the subscription should be
* automatically disabled if a worker error
* occurs */
+ XLogRecPtr skiplsn; /* All changes which finished at this LSN are
+ * skipped */
No need for 'which' in the above comments.
9.
Can we merge 029_disable_on_error in 030_skip_xact and name it as
029_on_error (or 029_on_error_skip_disable or some variant of it)?
Both seem to be related features. I am slightly worried about the pace
at which the number of test files is growing in the subscription tests.
--
With Regards,
Amit Kapila.
On Tuesday, March 15, 2022 3:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
A couple of minor comments on v14.
(1) apply_handle_commit_internal
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes();
+
+ /*
+ * Start a new transaction to clear the subskipxid, if not started
+ * yet. The transaction is committed below.
+ */
+ if (!IsTransactionState())
+ StartTransactionCommand();
+ }
+
I suppose we can move this condition check and stop_skipping_changes() call
to the inside of the block we enter when IsTransactionState() returns true.
As the comment of apply_handle_commit_internal() mentions,
it's the helper function for apply_handle_commit() and
apply_handle_stream_commit().
So I don't think either caller reaches
apply_handle_commit_internal() without an open transaction.
For applying spooled messages, we call begin_replication_step as well.
I may be missing something, but the only time we would receive a COMMIT
message without an open transaction seems to be an empty transaction in
which the subscription (and its apply worker) is not interested.
If this is true, the patch currently includes such cases within the
scope of the is_skipping_changes() check.
(2) clear_subscription_skip_lsn's comments.
The comments for this function should no longer mention updating the
origin state, now that we don't do that.
+/*
+ * Clear subskiplsn of pg_subscription catalog with origin state updated.
+ *
This applies to other comments.
+ /*
+ * Update the subskiplsn of the tuple to InvalidXLogRecPtr. If user has
+ * already changed subskiplsn before clearing it we don't update the
+ * catalog and don't advance the replication origin state.
...
+ * .... We can reduce the possibility by
+ * logging a replication origin WAL record to advance the origin LSN
+ * instead but there is no way to advance the origin timestamp and it
+ * doesn't seem to be worth doing anything about it since it's a very rare
+ * case.
+ */
Best Regards,
Takamichi Osumi
On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
6.
@@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||
Is there a reason to keep the skip_changes check here and in other DML
operations instead of at one central place in apply_dispatch?
Since we already have the check of applying the change on the spot at
the beginning of the handlers I feel it's better to add
is_skipping_changes() to the check than add a new if statement to
apply_dispatch, but do you prefer to check it in one central place in
apply_dispatch?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Mar 16, 2022 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
6.
@@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||
Is there a reason to keep the skip_changes check here and in other DML
operations instead of at one central place in apply_dispatch?
Since we already have the check of applying the change on the spot at
the beginning of the handlers I feel it's better to add
is_skipping_changes() to the check than add a new if statement to
apply_dispatch, but do you prefer to check it in one central place in
apply_dispatch?
I think either way is fine. I just wanted to know the reason, your
current change looks okay to me.
Some questions/comments
======================
1. IIRC, earlier, we thought of allowing the use of this option (SKIP)
only for superusers (as it can lead to inconsistent data if not used
carefully), but I don't see that check in the latest patch. What is the
reason for that?
2.
+ /*
+ * Update the subskiplsn of the tuple to InvalidXLogRecPtr.
I think we can change the above part of the comment to "Clear subskiplsn."
3.
+ * Since we already have
Isn't it better to say here: Since we have already ...?
--
With Regards,
Amit Kapila.
On Tue, Mar 15, 2022 at 7:30 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Tuesday, March 15, 2022 3:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
A couple of minor comments on v14.
(1) apply_handle_commit_internal
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes();
+
+ /*
+ * Start a new transaction to clear the subskipxid, if not started
+ * yet. The transaction is committed below.
+ */
+ if (!IsTransactionState())
+ StartTransactionCommand();
+ }
+
I suppose we can move this condition check and stop_skipping_changes() call
to the inside of the block we enter when IsTransactionState() returns true.
As the comment of apply_handle_commit_internal() mentions,
it's the helper function for apply_handle_commit() and
apply_handle_stream_commit().
So I don't think either caller reaches
apply_handle_commit_internal() without an open transaction.
For applying spooled messages, we call begin_replication_step as well.
I may be missing something, but the only time we would receive a COMMIT
message without an open transaction seems to be an empty transaction in
which the subscription (and its apply worker) is not interested.
I think when we skip non-streamed transactions we don't start a
transaction. So, if we do what you are suggesting, we will fail to
clear the skip_lsn after skipping the transaction.
--
With Regards,
Amit Kapila.
On Wed, Mar 16, 2022 at 7:58 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 16, 2022 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
6.
@@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||
Is there a reason to keep the skip_changes check here and in other DML
operations instead of at one central place in apply_dispatch?
Since we already have the check of applying the change on the spot at
the beginning of the handlers I feel it's better to add
is_skipping_changes() to the check than add a new if statement to
apply_dispatch, but do you prefer to check it in one central place in
apply_dispatch?
I think either way is fine. I just wanted to know the reason, your
current change looks okay to me.
I feel it is better to at least add a comment noting that we skip only
data-modification changes, because the handle_stream_* part of the
check appears in other message handlers as well. That will make it
easier to add a similar check to future message handlers.
--
With Regards,
Amit Kapila.
On Wed, Mar 16, 2022 at 7:58 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 16, 2022 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
6.
@@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||
Is there a reason to keep the skip_changes check here and in other DML
operations instead of at one central place in apply_dispatch?
Since we already have the check of applying the change on the spot at
the beginning of the handlers I feel it's better to add
is_skipping_changes() to the check than add a new if statement to
apply_dispatch, but do you prefer to check it in one central place in
apply_dispatch?
I think either way is fine. I just wanted to know the reason, your
current change looks okay to me.
Some questions/comments
======================
Some cosmetic suggestions:
======================
1.
+# Create subscriptions. Both subscription sets disable_on_error to on
+# so that they get disabled when a conflict occurs.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION $subname CONNECTION '$publisher_connstr'
PUBLICATION tap_pub WITH (streaming = on, two_phase = on,
disable_on_error = on);
+]);
I don't understand what you mean by 'Both subscription ...' in the
above comments.
2.
+ # Check the log indicating that successfully skipped the transaction,
How about slightly rephrasing this to: "Check the log to ensure that
the transaction is skipped...."?
--
With Regards,
Amit Kapila.
On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
Review:
=======
Thank you for the comments.
1.
+++ b/doc/src/sgml/logical-replication.sgml
@@ -366,15 +366,19 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
   transaction, the subscription needs to be disabled temporarily by
   <command>ALTER SUBSCRIPTION ... DISABLE</command> first or alternatively,
   the subscription can be used with the <literal>disable_on_error</literal>
   option.
-  Then, the transaction can be skipped by calling the
+  Then, the transaction can be skipped by using
+  <command>ALTER SUBSCRIPTION ... SKIP</command> with the finish LSN
+  (i.e., LSN 0/14C0378). After that the replication
+  can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>.
+  Alternatively, the transaction can also be skipped by calling the

Do we really need to disable the subscription for the skip feature? I
think that is required for origin_advance. Also, probably, we can say
Finish LSN could be Prepare LSN, Commit LSN, etc.
Not necessary to disable the subscription for skip feature. Fixed.
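
For reference, a minimal sketch of the intended workflow with this patch; the subscription name and LSN value here are taken from the documentation example and are illustrative only (use the finish LSN reported in your own server log):

```sql
-- Tell the apply worker to skip the whole remote transaction whose
-- finish (commit or prepare) LSN is 0/14C0378; no DISABLE step is needed.
ALTER SUBSCRIPTION test_sub SKIP (lsn = '0/14C0378');

-- If the LSN was specified by mistake, it can be reset before the
-- transaction arrives:
ALTER SUBSCRIPTION test_sub SKIP (lsn = NONE);
```

Per the patch, subskiplsn is cleared automatically once the transaction is skipped (or a non-empty transaction finishes), so the explicit reset is only needed to undo a mistaken setting.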
2.
+ /*
+  * Quick return if it's not requested to skip this transaction. This
+  * function is called every start of applying changes and we assume that
+  * skipping the transaction is not used in many cases.
+  */
+ if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn) ||

The second part of this comment (especially ".. every start of
applying changes ..") sounds slightly odd to me. How about changing it
to: "This function is called for every remote transaction and we
assume that skipping the transaction is not used in many cases."

Fixed.
3.
+ ereport(LOG,
+         errmsg("start skipping logical replication transaction which finished at %X/%X",
...
+ ereport(LOG,
+         (errmsg("done skipping logical replication transaction which finished at %X/%X",

No need of 'which' in the above LOG messages. I think the messages will
be clear without it.

Removed.
4.
+ ereport(LOG,
+         (errmsg("done skipping logical replication transaction which finished at %X/%X",
+                 LSN_FORMAT_ARGS(skip_xact_finish_lsn))));
+
+ /* Stop skipping changes */
+ skip_xact_finish_lsn = InvalidXLogRecPtr;

Let's reverse the order of these statements to make them consistent
with the corresponding maybe_start_* function.

But we cannot simply reverse the order, since skip_xact_finish_lsn is
used in the log message. Do we want to use a local variable for it?
5.
+ if (myskiplsn != finish_lsn)
+     ereport(WARNING,
+             errmsg("skip-LSN of logical replication subscription \"%s\" cleared", MySubscription->name),

Shouldn't this be a LOG instead of a WARNING, as this will be displayed
only in the server logs and by the background apply worker?

WARNING is also used by other auxiliary processes such as the archiver,
autovacuum workers, and the launcher. So I think we can use it here.
6.
@@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||

Is there a reason to keep the skip_changes check here and in other DML
operations instead of at one central place in apply_dispatch?
I'd leave it as is as I mentioned in another email. But I've added
some comments as you suggested.
7.
+ /*
+  * Start a new transaction to clear the subskipxid, if not started
+  * yet. The transaction is committed below.
+  */
+ if (!IsTransactionState())

I think the second part of the comment, "The transaction is committed
below.", is not required.
Removed.
8.
+ XLogRecPtr subskiplsn; /* All changes which finished at this LSN are
+                         * skipped */
+
 #ifdef CATALOG_VARLEN /* variable-length fields start here */
 /* Connection string to the publisher */
 text subconninfo BKI_FORCE_NOT_NULL;
@@ -109,6 +112,8 @@ typedef struct Subscription
  bool disableonerr; /* Indicates if the subscription should be
                      * automatically disabled if a worker error
                      * occurs */
+ XLogRecPtr skiplsn; /* All changes which finished at this LSN are
+                      * skipped */

No need for 'which' in the above comments.
Removed.
9.
Can we merge 029_disable_on_error in 030_skip_xact and name it as
029_on_error (or 029_on_error_skip_disable or some variant of it)?
Both seem to be related features. I am slightly worried at the pace at
which the number of test files are growing in subscription test.
Yes, we can merge them.
I'll submit an updated version patch after incorporating all comments I got.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wednesday, March 16, 2022 11:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 15, 2022 at 7:30 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Tuesday, March 15, 2022 3:13 PM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
A couple of minor comments on v14.
(1) apply_handle_commit_internal
+ if (is_skipping_changes())
+ {
+     stop_skipping_changes();
+
+     /*
+      * Start a new transaction to clear the subskipxid, if not started
+      * yet. The transaction is committed below.
+      */
+     if (!IsTransactionState())
+         StartTransactionCommand();
+ }
+

I suppose we can move this condition check and the
stop_skipping_changes() call inside the block we enter when
IsTransactionState() returns true. As the comment of
apply_handle_commit_internal() mentions, it's the helper function for
apply_handle_commit() and apply_handle_stream_commit(). So I couldn't
think of a case where the callers don't open a transaction before
calling apply_handle_commit_internal(). For applying spooled messages,
we call begin_replication_step as well. I may be missing something, but
a case where we receive a COMMIT message without an open transaction
would be an empty transaction that the subscription (and its apply
worker) is not interested in.

I think when we skip non-streamed transactions we don't start a
transaction. So, if we do what you are suggesting, we will fail to
clear the skip LSN after skipping the transaction.
OK, this is what I missed.
On the other hand, what I was worried about is that an empty
transaction can start skipping changes if the subskiplsn is equal to
its finish LSN. The reason is that we call
maybe_start_skipping_changes even for empty transactions and set
skip_xact_finish_lsn to the finish LSN in that case.

I confirmed I could make this happen with a debugger and some logging
of LSNs. What I did was set up two pairs of pub/sub and make a change
on one of them, after setting a breakpoint in logicalrep_write_begin
on the walsender that would issue an empty transaction. Then I checked
the finish LSN of it and ran an ALTER SUBSCRIPTION ... SKIP command
with that LSN value. As a result, the empty transaction calls
stop_skipping_changes in apply_handle_commit_internal and then enters
the IsTransactionState == true block, which would not happen before
applying the patch.

Also, this behavior looks contradictory to some comments in worker.c
("The subskiplsn is cleared after successfully skipping the
transaction or applying non-empty transaction."), so I was confused
and wrote the above comment.

I think this would not happen in practice, so it might be OK without a
special measure for this, but I wasn't sure.
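
(For anyone reproducing this: the subscription's skip LSN, and whether it has been cleared, can be observed directly in the catalog. This is a sketch based on the subskiplsn column the patch adds; 0/0 means the LSN has been cleared:)

```sql
-- Check whether the skip LSN set by ALTER SUBSCRIPTION ... SKIP is
-- still pending or has been cleared by the apply worker.
SELECT subname, subskiplsn FROM pg_subscription;
```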
Best Regards,
Takamichi Osumi
On Wednesday, March 16, 2022 3:37 PM I wrote:
On Wednesday, March 16, 2022 11:33 AM Amit Kapila
<amit.kapila16@gmail.com> wrote:

On Tue, Mar 15, 2022 at 7:30 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Tuesday, March 15, 2022 3:13 PM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
A couple of minor comments on v14.
(1) apply_handle_commit_internal
+ if (is_skipping_changes())
+ {
+     stop_skipping_changes();
+
+     /*
+      * Start a new transaction to clear the subskipxid, if not started
+      * yet. The transaction is committed below.
+      */
+     if (!IsTransactionState())
+         StartTransactionCommand();
+ }
+

I suppose we can move this condition check and the
stop_skipping_changes() call inside the block we enter when
IsTransactionState() returns true. As the comment of
apply_handle_commit_internal() mentions, it's the helper function for
apply_handle_commit() and apply_handle_stream_commit(). So I couldn't
think of a case where the callers don't open a transaction before
calling apply_handle_commit_internal(). For applying spooled messages,
we call begin_replication_step as well. I may be missing something, but
a case where we receive a COMMIT message without an open transaction
would be an empty transaction that the subscription (and its apply
worker) is not interested in.

I think when we skip non-streamed transactions we don't start a
transaction. So, if we do what you are suggesting, we will fail to
clear the skip LSN after skipping the transaction.

OK, this is what I missed.
On the other hand, what I was worried about is that an empty
transaction can start skipping changes if the subskiplsn is equal to
its finish LSN. The reason is that we call
maybe_start_skipping_changes even for empty transactions and set
skip_xact_finish_lsn to the finish LSN in that case.

I confirmed I could make this happen with a debugger and some logging
of LSNs. What I did was set up two pairs of pub/sub and make a change
on one of them, after setting a breakpoint in logicalrep_write_begin
on the walsender that would issue an empty transaction. Then I checked
the finish LSN of it and ran an ALTER SUBSCRIPTION ... SKIP command
with that LSN value. As a result, the empty transaction calls
stop_skipping_changes in apply_handle_commit_internal and then enters
the IsTransactionState == true block, which would not happen before
applying the patch.

Also, this behavior looks contradictory to some comments in worker.c
("The subskiplsn is cleared after successfully skipping the
transaction or applying non-empty transaction."), so I was confused
and wrote the above comment.
Sorry, my understanding was not correct.
Even when we clear the subskiplsn because of an empty transaction, we
can say that it counts as successfully skipping the transaction. So
this behavior, allowing an empty transaction to match the LSN
indicated by ALTER SUBSCRIPTION, is fine.
I'm sorry for the noise.
Best Regards,
Takamichi Osumi
On Wed, Mar 16, 2022 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 16, 2022 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
6.
@@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||

Is there a reason to keep the skip_changes check here and in other DML
operations instead of at one central place in apply_dispatch?

Since we already have the check of applying the change on the spot at
the beginning of the handlers, I feel it's better to add
is_skipping_changes() to that check than to add a new if statement to
apply_dispatch, but do you prefer to check it in one central place in
apply_dispatch?

I think either way is fine. I just wanted to know the reason; your
current change looks okay to me.

Some questions/comments
======================
1. IIRC, earlier, we thought of allowing to use of this option (SKIP)
only for superusers (as this can lead to inconsistent data if not used
carefully) but I don't see that check in the latest patch. What is the
reason for the same?
I thought the non-superuser subscription owner could resolve the
conflict by manually manipulating the relations, which has the same
result as skipping all data modification changes with the ALTER
SUBSCRIPTION SKIP feature. But after more thought, it would not be
exactly the same, since the skipped transaction might include changes
to a relation that the owner doesn't have permission on.
2.
+ /*
+  * Update the subskiplsn of the tuple to InvalidXLogRecPtr.

I think we can change the above part of the comment to "Clear subskiplsn."
Fixed.
3.
+ * Since we already have

Isn't it better to say here: "Since we have already ..."?
Fixed.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Wed, Mar 16, 2022 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 16, 2022 at 7:58 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 16, 2022 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
6.
@@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||

Is there a reason to keep the skip_changes check here and in other DML
operations instead of at one central place in apply_dispatch?

Since we already have the check of applying the change on the spot at
the beginning of the handlers, I feel it's better to add
is_skipping_changes() to that check than to add a new if statement to
apply_dispatch, but do you prefer to check it in one central place in
apply_dispatch?

I think either way is fine. I just wanted to know the reason; your
current change looks okay to me.

Some questions/comments
======================

Some cosmetic suggestions:
======================
1.
+# Create subscriptions. Both subscription sets disable_on_error to on
+# so that they get disabled when a conflict occurs.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION $subname CONNECTION '$publisher_connstr'
PUBLICATION tap_pub WITH (streaming = on, two_phase = on,
disable_on_error = on);
+]);

I don't understand what you mean by 'Both subscription ...' in the
above comments.
Fixed.
2.
+ # Check the log indicating that successfully skipped the transaction,

How about slightly rephrasing this to: "Check the log to ensure that
the transaction is skipped...."?
Fixed.
I've attached an updated version patch.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
v15-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patch (application/octet-stream)
From 0770db12abe0d10a918aad618c1bdf6a4e53350d Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v15] Add ALTER SUBSCRIPTION ... SKIP to skip the transaction
on subscriber nodes.
If incoming change violates any constraint, logical replication stops
until it's resolved. This commit introduces another way to skip the
transaction in question, other than manually updating the subscriber's
database or using pg_replication_origin_advance().
The user can specify LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX),
updating pg_subscription.subskiplsn field, telling the apply worker to
skip the transaction. The apply worker skips all data modification changes
within the specified transaction.
After successfully skipping the transaction or finishing the
transaction, the apply worker clears pg_subscription.subskiplsn.
Author: Masahiko Sawada
Reviewed-by: Vignesh C, Greg Nancarrow, Takamichi Osumi, Haiying Tang, Hou Zhijie, Peter Eisentraut, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 30 ++-
doc/src/sgml/ref/alter_subscription.sgml | 42 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 2 +-
src/backend/commands/subscriptioncmds.c | 72 ++++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 233 +++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 5 +-
src/include/catalog/pg_subscription.h | 5 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 126 +++++-----
src/test/regress/sql/subscription.sql | 11 +
.../subscription/t/029_disable_on_error.pl | 94 -------
src/test/subscription/t/029_on_error.pl | 182 ++++++++++++++
17 files changed, 664 insertions(+), 173 deletions(-)
delete mode 100644 src/test/subscription/t/029_disable_on_error.pl
create mode 100644 src/test/subscription/t/029_on_error.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 7777d60514..eec06b90e8 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7779,6 +7779,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Finish LSN of the transaction whose changes are to be skipped, if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 6431d4796d..7aaeb41b43 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -362,19 +362,25 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
</screen>
The LSN of the transaction that contains the change violating the constraint and
the replication origin name can be found from the server log (LSN 0/14C0378 and
- replication origin <literal>pg_16395</literal> in the above case). To skip the
- transaction, the subscription needs to be disabled temporarily by
- <command>ALTER SUBSCRIPTION ... DISABLE</command> first or alternatively, the
- subscription can be used with the <literal>disable_on_error</literal> option.
- Then, the transaction can be skipped by calling the
- <link linkend="pg-replication-origin-advance">
- <function>pg_replication_origin_advance()</function></link> function with
- the <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>) and the
- next LSN of the transaction's LSN (i.e., LSN 0/14C0379). After that the replication
- can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>. The current
- position of origins can be seen in the
- <link linkend="view-pg-replication-origin-status">
+ replication origin <literal>pg_16395</literal> in the above case). The
+ transaction can be skipped by using
+ <command>ALTER SUBSCRIPTION ... SKIP</command> with the finish LSN
+ (i.e., LSN 0/14C0378). The finish LSN could be an LSN at which the transaction
+ is committed or prepared on the publisher. Alternatively, the transaction can
+ also be skipped by calling the <link linkend="pg-replication-origin-advance">
+ <function>pg_replication_origin_advance()</function></link> function.
+ Before using this function, the subscription needs to be disabled
+ temporarily by <command>ALTER SUBSCRIPTION ... DISABLE</command> or
+ alternatively, the subscription can be used with the
+ <literal>disable_on_error</literal> option. Then, you can use
+ <function>pg_replication_origin_advance()</function> function with the
+ <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>) and the
+ next LSN of the finish LSN (i.e., 0/14C0379). The current position of
+ origins can be seen in the <link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
+ Please note that skipping the whole transaction includes skipping changes that
+ might not violate any constraint. This can easily make the subscriber
+ inconsistent.
</para>
</sect1>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 58b78a94ea..25d533992b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -210,6 +211,47 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the specified remote transaction. If incoming
+ data violates any constraints, logical replication will stop until it is
+ resolved. By using the <command>ALTER SUBSCRIPTION ... SKIP</command> command,
+ the logical replication worker skips all data modification changes within
+ the specified transaction. This option has no effect on the transactions
+ that are already prepared by enabling <literal>two_phase</literal> on
+ subscriber.
+ After the logical replication worker successfully skips the transaction or
+ finishes a transaction, the LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts. Using this command requires
+ superuser privilege.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>lsn</literal> (<type>pg_lsn</type>)</term>
+ <listitem>
+ <para>
+ Specifies the finish LSN of the remote transaction whose changes
+ are to be skipped by the logical replication worker. The finish LSN
+ is the LSN at which the transaction is either committed or prepared.
+ Skipping individual subtransactions is not supported. Setting
+ <literal>NONE</literal> resets the LSN.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a6304f5f81..0ff0982f7b 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -70,6 +70,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
sub->disableonerr = subform->subdisableonerr;
+ sub->skiplsn = subform->subskiplsn;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bb1ac30cd1..bd48ee7bd2 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,7 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subdisableonerr, subslotname,
+ substream, subtwophasestate, subdisableonerr, subskiplsn, subslotname,
subsynccommit, subpublications)
ON pg_subscription TO public;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3922658bbc..a3d1270132 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -45,6 +45,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/syscache.h"
/*
@@ -62,6 +63,7 @@
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
#define SUBOPT_DISABLE_ON_ERR 0x00000400
+#define SUBOPT_LSN 0x00000800
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -84,6 +86,7 @@ typedef struct SubOpts
bool streaming;
bool twophase;
bool disableonerr;
+ XLogRecPtr lsn;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -262,6 +265,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_DISABLE_ON_ERR;
opts->disableonerr = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_LSN) &&
+ strcmp(defel->defname, "lsn") == 0)
+ {
+ char *lsn_str = defGetString(defel);
+ XLogRecPtr lsn;
+
+ if (IsSet(opts->specified_opts, SUBOPT_LSN))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting lsn = NONE is treated as resetting LSN */
+ if (strcmp(lsn_str, "none") == 0)
+ lsn = InvalidXLogRecPtr;
+ else
+ {
+ /* Parse the argument as LSN */
+ lsn = DatumGetLSN(DirectFunctionCall1(pg_lsn_in,
+ CStringGetDatum(lsn_str)));
+
+ if (XLogRecPtrIsInvalid(lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL location (LSN): %s", lsn_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_LSN;
+ opts->lsn = lsn;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -479,6 +509,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
values[Anum_pg_subscription_subdisableonerr - 1] = BoolGetDatum(opts.disableonerr);
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1106,6 +1137,47 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ parse_subscription_options(pstate, stmt->options, SUBOPT_LSN, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only LSN option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_LSN));
+
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ /*
+ * If the user set subskiplsn, we do a sanity check to make
+ * sure that the specified LSN is a probable value.
+ */
+ if (!XLogRecPtrIsInvalid(opts.lsn))
+ {
+ RepOriginId originid;
+ char originname[NAMEDATALEN];
+ XLogRecPtr remote_lsn;
+
+ snprintf(originname, sizeof(originname), "pg_%u", subid);
+ originid = replorigin_by_name(originname, false);
+ remote_lsn = replorigin_get_progress(originid, false);
+
+ /* Check the given LSN is at least a future LSN */
+ if (!XLogRecPtrIsInvalid(remote_lsn) && opts.lsn < remote_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
+ LSN_FORMAT_ARGS(remote_lsn))));
+ }
+
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(opts.lsn);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ update_tuple = true;
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a03b33b53b..0036c2f9e2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9983,6 +9983,15 @@ AlterSubscriptionStmt:
(Node *)makeBoolean(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 03e069c7cd..2e7dd83fe0 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -189,6 +190,7 @@
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/rel.h"
#include "utils/rls.h"
#include "utils/syscache.h"
@@ -259,6 +261,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the remote transaction's finish LSN matches the subskiplsn.
+ * Once we start skipping changes, we don't stop it until we skip all changes of
+ * the transaction even if pg_subscription is updated and MySubscription->skiplsn
+ * gets changed or reset during that. Also, in streaming transaction cases, we
+ * don't skip receiving and spooling the changes since we decide whether or not
+ * to skip applying the changes when starting to apply changes. The subskiplsn is
+ * cleared after successfully skipping the transaction or applying non-empty
+ * transaction. The latter prevents the mistakenly specified subskiplsn from
+ * being left.
+ */
+static XLogRecPtr skip_xact_finish_lsn = InvalidXLogRecPtr;
+#define is_skipping_changes() (unlikely(!XLogRecPtrIsInvalid(skip_xact_finish_lsn)))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -336,6 +353,11 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(XLogRecPtr finish_lsn);
+static void stop_skipping_changes(void);
+static void clear_subscription_skip_lsn(XLogRecPtr finish_lsn);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, XLogRecPtr lsn);
@@ -795,6 +817,8 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -847,6 +871,8 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -905,9 +931,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of no change.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -928,6 +954,15 @@ apply_handle_prepare(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Since we have already prepared the transaction, if the server crashes
+ * before clearing the subskiplsn, it will be left set but the transaction
+ * won't be resent. That's okay because it's a rare case and the
+ * subskiplsn will be cleared when finishing the next transaction.
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
}
@@ -969,6 +1004,8 @@ apply_handle_commit_prepared(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ clear_subscription_skip_lsn(prepare_data.end_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
}
@@ -1010,6 +1047,8 @@ apply_handle_rollback_prepared(StringInfo s)
FinishPreparedTransaction(gid, false);
end_replication_step();
CommitTransactionCommand();
+
+ clear_subscription_skip_lsn(rollback_data.rollback_end_lsn);
}
pgstat_report_stat(false);
@@ -1072,6 +1111,13 @@ apply_handle_stream_prepare(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Similar to the prepare case, the subskiplsn could be left set after a
+ * server crash, but that's okay. See the comments in apply_handle_prepare().
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
@@ -1311,6 +1357,8 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
MemoryContext oldcxt;
BufFile *fd;
+ maybe_start_skipping_changes(lsn);
+
/* Make sure we have an open transaction */
begin_replication_step();
@@ -1455,8 +1503,26 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes();
+
+ /*
+ * Start a new transaction to clear the subskiplsn, if not started
+ * yet.
+ */
+ if (!IsTransactionState())
+ StartTransactionCommand();
+ }
+
if (IsTransactionState())
{
+ /*
+ * The transaction is either non-empty or skipped, so we clear the
+ * subskiplsn.
+ */
+ clear_subscription_skip_lsn(commit_data->commit_lsn);
+
/*
* Update origin state so we can restart streaming from correct
* position in case of crash.
@@ -1583,7 +1649,12 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
return;
begin_replication_step();
@@ -1710,7 +1781,12 @@ apply_handle_update(StringInfo s)
RangeTblEntry *target_rte;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
return;
begin_replication_step();
@@ -1874,7 +1950,12 @@ apply_handle_delete(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
return;
begin_replication_step();
@@ -2261,7 +2342,12 @@ apply_handle_truncate(StringInfo s)
ListCell *lc;
LOCKMODE lockmode = AccessExclusiveLock;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
return;
begin_replication_step();
@@ -3738,6 +3824,139 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given LSN matches the
+ * LSN specified by subscription's skiplsn.
+ */
+static void
+maybe_start_skipping_changes(XLogRecPtr finish_lsn)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /*
+ * Quick return if we are not requested to skip this transaction. This
+ * function is called for every remote transaction, and we assume that
+ * skipping transactions is rarely requested.
+ */
+ if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn) ||
+ MySubscription->skiplsn != finish_lsn))
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skip_xact_finish_lsn = finish_lsn;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn)));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xact_finish_lsn if enabled.
+ */
+static void
+stop_skipping_changes(void)
+{
+ if (!is_skipping_changes())
+ return;
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn))));
+
+ /* Stop skipping changes */
+ skip_xact_finish_lsn = InvalidXLogRecPtr;
+}
+
+/*
+ * Clear subskiplsn of pg_subscription catalog.
+ *
+ * finish_lsn is the transaction's finish LSN that is used to check whether
+ * the subskiplsn matches it. If it doesn't match, we raise a warning when
+ * clearing the subskiplsn, to inform users of cases where, e.g., they
+ * mistakenly specified the wrong subskiplsn.
+ */
+static void
+clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ XLogRecPtr myskiplsn = MySubscription->skiplsn;
+ bool started_tx = false;
+
+ if (likely(XLogRecPtrIsInvalid(myskiplsn)))
+ return;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskiplsn of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Clear the subskiplsn. If the user has already changed the subskiplsn
+ * before we clear it, we neither update the catalog nor advance the
+ * replication origin state. So in the worst case, if the server crashes
+ * before sending an acknowledgment of the flush position, the
+ * transaction will be sent again and the user will need to set the
+ * subskiplsn again. We could reduce that possibility by logging a
+ * replication origin WAL record to advance the origin LSN instead, but
+ * there is no way to advance the origin timestamp, and it doesn't seem
+ * worth doing anything about it since it's a very rare case.
+ */
+ if (subform->subskiplsn == myskiplsn)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskiplsn */
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ if (myskiplsn != finish_lsn)
+ ereport(WARNING,
+ errmsg("skip-LSN of logical replication subscription \"%s\" cleared", MySubscription->name),
+ errdetail("Remote transaction's finish WAL location (LSN) %X/%X did not match skip-LSN %X/%X.",
+ LSN_FORMAT_ARGS(finish_lsn),
+ LSN_FORMAT_ARGS(myskiplsn)));
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
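(Aside, not part of the patch: the control flow that the worker.c hunks above add can be modeled roughly as below. This is an illustrative Python sketch with invented names; the real code operates on XLogRecPtr values and clears pg_subscription.subskiplsn under a lock on the subscription object.)

```python
INVALID_LSN = 0  # stands in for InvalidXLogRecPtr


class ApplyWorker:
    """Toy model of the skip-transaction state machine in worker.c."""

    def __init__(self, subscription_skiplsn):
        self.skiplsn = subscription_skiplsn      # pg_subscription.subskiplsn
        self.skip_xact_finish_lsn = INVALID_LSN  # static variable in worker.c
        self.applied = []                        # changes actually applied

    def is_skipping_changes(self):
        return self.skip_xact_finish_lsn != INVALID_LSN

    def handle_begin(self, finish_lsn):
        # maybe_start_skipping_changes(): skip only on an exact LSN match
        if self.skiplsn != INVALID_LSN and self.skiplsn == finish_lsn:
            self.skip_xact_finish_lsn = finish_lsn

    def handle_change(self, change):
        # apply_handle_insert/update/delete/truncate(): quick return
        if self.is_skipping_changes():
            return
        self.applied.append(change)

    def handle_commit(self, finish_lsn):
        # stop_skipping_changes() + clear_subscription_skip_lsn()
        self.skip_xact_finish_lsn = INVALID_LSN
        if self.skiplsn != INVALID_LSN:
            if self.skiplsn != finish_lsn:
                print("WARNING: skip-LSN did not match; cleared anyway")
            self.skiplsn = INVALID_LSN  # catalog update in the real code


# The transaction whose finish LSN matches subskiplsn is skipped wholesale;
# applying any subsequent transaction clears a stale subskiplsn on commit.
w = ApplyWorker(subscription_skiplsn=0x12345)
w.handle_begin(0x12345)
w.handle_change("INSERT 1")   # skipped
w.handle_commit(0x12345)
w.handle_begin(0x20000)
w.handle_change("INSERT 2")   # applied
w.handle_commit(0x20000)
```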
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4dd24b8c89..202bca4b23 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4359,6 +4359,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't include subskiplsn in the dump
+ * because the value may no longer be relevant after the dump is restored.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 9229eacb6d..4c6c370b6f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6084,7 +6084,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false};
+ false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6131,6 +6131,12 @@ describeSubscriptions(const char *pattern, bool verbose)
", subconninfo AS \"%s\"\n",
gettext_noop("Synchronous commit"),
gettext_noop("Conninfo"));
+
+ /* Skip LSN is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskiplsn AS \"%s\"\n",
+ gettext_noop("Skip LSN"));
}
/* Only display subscriptions in current database. */
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 17172827a9..11cb41128e 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1819,7 +1819,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP (",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1835,6 +1835,9 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit", "disable_on_error");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("lsn");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index e2befaf351..69969a0617 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
bool subdisableonerr; /* True if a worker error should cause the
* subscription to be disabled */
+ XLogRecPtr subskiplsn; /* All changes finished at this LSN are
+ * skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -109,6 +112,8 @@ typedef struct Subscription
bool disableonerr; /* Indicates if the subscription should be
* automatically disabled if a worker error
* occurs */
+ XLogRecPtr skiplsn; /* All changes finished at this LSN are
+ * skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 1617702d9d..6f83a79a96 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3726,7 +3726,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
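(Aside: the regression tests below specify LSNs in PostgreSQL's textual %X/%X form, i.e. the high and low 32 bits of the 64-bit WAL position in hex, which is also what LSN_FORMAT_ARGS produces. An illustrative Python round-trip, not part of the patch:)

```python
def lsn_parse(text: str) -> int:
    """Parse an LSN like '0/12345' into a 64-bit integer (XLogRecPtr)."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_format(lsn: int) -> str:
    """Format a 64-bit LSN the way LSN_FORMAT_ARGS does: %X/%X."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


# '0/0' parses to 0, i.e. InvalidXLogRecPtr, which is why
# ALTER SUBSCRIPTION ... SKIP (lsn = '0/0') is rejected below;
# lsn = NONE is the way to clear the setting instead.
print(lsn_format(lsn_parse("0/12345")))  # 0/12345
```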
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ad8003fae1..7fcfad1591 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,25 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/12345
+(1 row)
+
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -129,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +179,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +202,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -215,10 +229,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -233,10 +247,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +284,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +296,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +308,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -309,18 +323,18 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index a7c15b1daf..74c38ead5d 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,17 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
+
+\dRs+
+
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/029_disable_on_error.pl b/src/test/subscription/t/029_disable_on_error.pl
deleted file mode 100644
index 5eca804446..0000000000
--- a/src/test/subscription/t/029_disable_on_error.pl
+++ /dev/null
@@ -1,94 +0,0 @@
-
-# Copyright (c) 2021-2022, PostgreSQL Global Development Group
-
-# Test of logical replication subscription self-disabling feature.
-use strict;
-use warnings;
-use PostgreSQL::Test::Cluster;
-use PostgreSQL::Test::Utils;
-use Test::More;
-
-# create publisher node
-my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
-$node_publisher->init(allows_streaming => 'logical');
-$node_publisher->start;
-
-# create subscriber node
-my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
-$node_subscriber->init;
-$node_subscriber->start;
-
-# Create identical table on both nodes.
-$node_publisher->safe_psql('postgres', "CREATE TABLE tbl (i INT)");
-$node_subscriber->safe_psql('postgres', "CREATE TABLE tbl (i INT)");
-
-# Insert duplicate values on the publisher.
-$node_publisher->safe_psql('postgres',
- "INSERT INTO tbl (i) VALUES (1), (1), (1)");
-
-# Create an additional unique index on the subscriber.
-$node_subscriber->safe_psql('postgres',
- "CREATE UNIQUE INDEX tbl_unique ON tbl (i)");
-
-# Create a pub/sub to set up logical replication. This tests that the
-# uniqueness violation will cause the subscription to fail during initial
-# synchronization and make it disabled.
-my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
-$node_publisher->safe_psql('postgres',
- "CREATE PUBLICATION pub FOR TABLE tbl");
-$node_subscriber->safe_psql('postgres',
- "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (disable_on_error = true)"
-);
-
-# Initial synchronization failure causes the subscription to be disabled.
-$node_subscriber->poll_query_until('postgres',
- "SELECT subenabled = false FROM pg_catalog.pg_subscription WHERE subname = 'sub'"
-) or die "Timed out while waiting for subscriber to be disabled";
-
-# Drop the unique index on the subscriber which caused the subscription to be
-# disabled.
-$node_subscriber->safe_psql('postgres', "DROP INDEX tbl_unique");
-
-# Re-enable the subscription "sub".
-$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
-
-# Wait for the data to replicate.
-$node_publisher->wait_for_catchup('sub');
-$node_subscriber->poll_query_until('postgres',
- "SELECT COUNT(1) = 0 FROM pg_subscription_rel sr WHERE sr.srsubstate NOT IN ('s', 'r') AND sr.srrelid = 'tbl'::regclass"
-);
-
-# Confirm that we have finished the table sync.
-my $result =
- $node_subscriber->safe_psql('postgres', "SELECT MAX(i), COUNT(*) FROM tbl");
-is($result, qq(1|3), "subscription sub replicated data");
-
-# Delete the data from the subscriber and recreate the unique index.
-$node_subscriber->safe_psql('postgres', "DELETE FROM tbl");
-$node_subscriber->safe_psql('postgres',
- "CREATE UNIQUE INDEX tbl_unique ON tbl (i)");
-
-# Add more non-unique data to the publisher.
-$node_publisher->safe_psql('postgres',
- "INSERT INTO tbl (i) VALUES (3), (3), (3)");
-
-# Apply failure causes the subscription to be disabled.
-$node_subscriber->poll_query_until('postgres',
- "SELECT subenabled = false FROM pg_catalog.pg_subscription WHERE subname = 'sub'"
-) or die "Timed out while waiting for subscription sub to be disabled";
-
-# Drop the unique index on the subscriber and re-enabled the subscription. Then
-# confirm that the previously failing insert was applied OK.
-$node_subscriber->safe_psql('postgres', "DROP INDEX tbl_unique");
-$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
-
-$node_publisher->wait_for_catchup('sub');
-
-$result = $node_subscriber->safe_psql('postgres',
- "SELECT COUNT(*) FROM tbl WHERE i = 3");
-is($result, qq(3), 'check the result of apply');
-
-$node_subscriber->stop;
-$node_publisher->stop;
-
-done_testing();
diff --git a/src/test/subscription/t/029_on_error.pl b/src/test/subscription/t/029_on_error.pl
new file mode 100644
index 0000000000..efec3ce728
--- /dev/null
+++ b/src/test/subscription/t/029_on_error.pl
@@ -0,0 +1,182 @@
+
+# Copyright (c) 2021-2022, PostgreSQL Global Development Group
+
+# Test of logical replication subscription self-disabling feature.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $offset = 0;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with the subscriber. The finish LSN of the
+# failed transaction, fetched from the server logs, is passed to ALTER
+# SUBSCRIPTION ... SKIP. After executing ALTER SUBSCRIPTION ... SKIP, we
+# check that logical replication can continue working by inserting
+# $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $nonconflict_data, $expected, $msg)
+ = @_;
+
+ # Wait until a conflict occurs on the subscriber.
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subenabled = FALSE FROM pg_subscription WHERE subname = 'sub'"
+ );
+
+ # Get the finish LSN of the error transaction.
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.tbl" in transaction \d+ finished at ([[:xdigit:]]+\/[[:xdigit:]]+)/
+ or die "could not get error-LSN";
+ my $lsn = $1;
+
+ # Set skip lsn.
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION sub SKIP (lsn = '$lsn')");
+
+ # Re-enable the subscription.
+ $node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'"
+ );
+
+ # Check the log to ensure that the transaction is skipped, and advance
+ # the offset of the log file for the next test.
+ $offset = $node_subscriber->wait_for_log(
+ qr/LOG: done skipping logical replication transaction finished at $lsn/,
+ $offset);
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql('postgres',
+ "INSERT INTO tbl VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup('sub');
+
+ # Check replicated data
+ my $res =
+ $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM tbl");
+ is($res, $expected, $msg);
+}
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf(
+ 'postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber, we
+# create the same tables but with a primary key. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE TABLE tbl (i INT, t TEXT);
+INSERT INTO tbl VALUES (1, NULL);
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE TABLE tbl (i INT PRIMARY KEY, t TEXT);
+INSERT INTO tbl VALUES (1, NULL);
+]);
+
+# Create a pub/sub to set up logical replication. This tests that the
+# uniqueness violation will cause the subscription to fail during initial
+# synchronization and make it disabled.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub FOR TABLE tbl");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (disable_on_error = true, streaming = on, two_phase = on)"
+);
+
+# Initial synchronization failure causes the subscription to be disabled.
+$node_subscriber->poll_query_until('postgres',
+ "SELECT subenabled = false FROM pg_catalog.pg_subscription WHERE subname = 'sub'"
+) or die "Timed out while waiting for subscriber to be disabled";
+
+# Truncate the table on the subscriber which caused the subscription to be
+# disabled.
+$node_subscriber->safe_psql('postgres', "TRUNCATE tbl");
+
+# Re-enable the subscription "sub".
+$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+# Wait for the data to replicate.
+$node_publisher->wait_for_catchup('sub');
+$node_subscriber->poll_query_until('postgres',
+ "SELECT COUNT(1) = 0 FROM pg_subscription_rel sr WHERE sr.srsubstate NOT IN ('s', 'r') AND sr.srrelid = 'tbl'::regclass"
+);
+
+# Confirm that we have finished the table sync.
+my $result =
+ $node_subscriber->safe_psql('postgres', "SELECT COUNT(*) FROM tbl");
+is($result, qq(1), "subscription sub replicated data");
+
+# Insert data to tbl, raising an error on the subscriber due to violation
+# of the unique constraint on tbl. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO tbl VALUES (1, NULL);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber,
+ "(2, NULL)", "2", "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data to tbl and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO tbl VALUES (1, NULL);
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber,
+ "(3, NULL)", "3", "test skipping prepare and commit prepared ");
+
+# Test for STREAM COMMIT. Insert enough rows to tbl to exceed the 64kB
+# limit, also raising an error on the subscriber during applying spooled
+# changes for the same reason. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO tbl SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "(4, md5(4::text))",
+ "4", "test skipping stream-commit");
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT COUNT(*) FROM pg_prepared_xacts");
+is($result, "0",
+ "check all prepared transactions are resolved on the subscriber");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.24.3 (Apple Git-128)
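As an aside, the finish-LSN extraction that 029_on_error.pl above performs with a Perl regex can be sketched in Python against a sample of the apply worker's error CONTEXT line. The log text, XID, and LSN below are illustrative values, not output from a real run:

```python
import re

# Pattern mirroring the Perl regex in test_skip_xact(): capture the finish LSN
# ("X/Y", both halves hexadecimal) from the worker's error context line.
LSN_RE = re.compile(r'in transaction \d+ finished at ([0-9A-Fa-f]+/[0-9A-Fa-f]+)')

# Hypothetical sample of the CONTEXT line emitted on an apply error.
sample = (
    'CONTEXT: processing remote data for replication origin "pg_16395" '
    'during "INSERT" for replication target relation "public.tbl" '
    'in transaction 731 finished at 0/14C0378'
)

m = LSN_RE.search(sample)
assert m is not None
lsn = m.group(1)
print(lsn)  # -> 0/14C0378
```

The captured value is what the test then feeds to ALTER SUBSCRIPTION ... SKIP (lsn = ...).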
On Wed, Mar 16, 2022 4:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
Thanks for updating the patch. Here are some comments for the v15 patch.
1. src/backend/replication/logical/worker.c
+ * to skip applying the changes when starting to apply changes. The subskiplsn is
+ * cleared after successfully skipping the transaction or applying non-empty
+ * transaction. The latter prevents the mistakenly specified subskiplsn from
Should "applying non-empty transaction" be modified to "finishing a
transaction"? To be consistent with the description in the
alter_subscription.sgml.
2. src/test/subscription/t/029_on_error.pl
+# Test of logical replication subscription self-disabling feature.
Should we add something about "skip logical replication transactions" in this
comment?
Regards,
Shi yu
On Thu, Mar 17, 2022 at 8:13 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
On Wed, Mar 16, 2022 4:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
Thanks for updating the patch. Here are some comments for the v15 patch.
1. src/backend/replication/logical/worker.c
+ * to skip applying the changes when starting to apply changes. The subskiplsn is
+ * cleared after successfully skipping the transaction or applying non-empty
+ * transaction. The latter prevents the mistakenly specified subskiplsn from
Should "applying non-empty transaction" be modified to "finishing a
transaction"? To be consistent with the description in the
alter_subscription.sgml.
The current wording in the patch seems okay to me as it is good to
emphasize non-empty transactions.
2. src/test/subscription/t/029_on_error.pl
+# Test of logical replication subscription self-disabling feature.
Should we add something about "skip logical replication transactions" in this
comment?
How about: "Tests for disable_on_error and SKIP transaction features."?
I am making some other minor edits in the patch and will take care of
whatever we decide for these comments.
--
With Regards,
Amit Kapila.
On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
The patch LGTM. I have made minor changes in comments and docs in the
attached patch. Kindly let me know what you think of the attached?
I am planning to commit this early next week (on Monday) unless there
are more comments/suggestions.
--
With Regards,
Amit Kapila.
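For reference, the plausibility check that the patch below adds to AlterSubscription() (a non-reset skip LSN must not lag behind the origin's already-applied position) amounts to parsing the 'X/Y' form into a 64-bit position and comparing. A minimal sketch, with `parse_lsn` and `skip_lsn_is_plausible` being illustrative names rather than anything in the patch:

```python
def parse_lsn(s: str) -> int:
    """Parse a PostgreSQL LSN of the form 'X/Y' into a 64-bit integer,
    with both halves hexadecimal, as pg_lsn_in does."""
    hi, lo = s.split('/')
    return (int(hi, 16) << 32) | int(lo, 16)

INVALID_LSN = 0  # InvalidXLogRecPtr, i.e. '0/0'

def skip_lsn_is_plausible(skip_lsn: int, origin_progress: int) -> bool:
    """Mirror of the sanity check: a valid skip LSN must not be less than
    the replication origin's current progress."""
    if skip_lsn == INVALID_LSN:
        return True   # lsn = NONE resets the skip LSN; always allowed
    if origin_progress != INVALID_LSN and skip_lsn < origin_progress:
        return False  # would never match: that position is already applied
    return True

print(skip_lsn_is_plausible(parse_lsn('0/14C0378'), parse_lsn('0/14C0300')))  # True
print(skip_lsn_is_plausible(parse_lsn('0/12345'), parse_lsn('0/14C0378')))    # False
```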
Attachments:
v16-0001-Add-ALTER-SUBSCRIPTION-.-SKIP.patch (application/octet-stream)
From 4b0222725e1c6511f4783b019c988c4624604a1c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v16] Add ALTER SUBSCRIPTION ... SKIP.
This feature allows skipping the transaction on subscriber nodes.
If an incoming change violates any constraint, logical replication stops
until it's resolved. Currently, users need to either manually resolve the
conflict by updating the subscriber-side database or use the
pg_replication_origin_advance() function to skip the conflicting
transaction. This commit introduces a simpler way to skip the conflicting
transactions.
The user can specify the LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX),
which allows the apply worker to skip the transaction finished at the
specified LSN. The apply worker skips all data modification changes within
the transaction.
Author: Masahiko Sawada
Reviewed-by: Takamichi Osumi, Hou Zhijie, Peter Eisentraut, Amit Kapila, Shi Yu, Vignesh C, Greg Nancarrow, Haiying Tang
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 27 +--
doc/src/sgml/ref/alter_subscription.sgml | 42 +++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 2 +-
src/backend/commands/subscriptioncmds.c | 72 ++++++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 233 +++++++++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 5 +-
src/include/catalog/pg_subscription.h | 5 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 126 +++++++------
src/test/regress/sql/subscription.sql | 11 ++
src/test/subscription/t/029_disable_on_error.pl | 94 ----------
src/test/subscription/t/029_on_error.pl | 183 +++++++++++++++++++
17 files changed, 663 insertions(+), 172 deletions(-)
delete mode 100644 src/test/subscription/t/029_disable_on_error.pl
create mode 100644 src/test/subscription/t/029_on_error.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 7777d60..eec06b9 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7781,6 +7781,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<row>
<entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Finish LSN of the transaction whose changes are to be skipped, if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
</para>
<para>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 6431d47..555fbd7 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -362,19 +362,24 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
</screen>
The LSN of the transaction that contains the change violating the constraint and
the replication origin name can be found from the server log (LSN 0/14C0378 and
- replication origin <literal>pg_16395</literal> in the above case). To skip the
- transaction, the subscription needs to be disabled temporarily by
- <command>ALTER SUBSCRIPTION ... DISABLE</command> first or alternatively, the
+ replication origin <literal>pg_16395</literal> in the above case). The
+ transaction that produced the conflict can be skipped by using
+ <command>ALTER SUBSCRIPTION ... SKIP</command> with the finish LSN
+ (i.e., LSN 0/14C0378). The finish LSN could be an LSN at which the transaction
+ is committed or prepared on the publisher. Alternatively, the transaction can
+ also be skipped by calling the <link linkend="pg-replication-origin-advance">
+ <function>pg_replication_origin_advance()</function></link> function.
+ Before using this function, the subscription needs to be disabled
+ temporarily either by <command>ALTER SUBSCRIPTION ... DISABLE</command> or alternatively, the
subscription can be used with the <literal>disable_on_error</literal> option.
- Then, the transaction can be skipped by calling the
- <link linkend="pg-replication-origin-advance">
- <function>pg_replication_origin_advance()</function></link> function with
- the <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>) and the
- next LSN of the transaction's LSN (i.e., LSN 0/14C0379). After that the replication
- can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>. The current
- position of origins can be seen in the
- <link linkend="view-pg-replication-origin-status">
+ Then, you can use <function>pg_replication_origin_advance()</function> function
+ with the <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>)
+ and the next LSN of the finish LSN (i.e., 0/14C0379). The current position of
+ origins can be seen in the <link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
+ Please note that skipping the whole transaction includes skipping changes that
+ might not violate any constraint. This can easily make the subscriber
+ inconsistent.
</para>
</sect1>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 58b78a9..ac2db24 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -211,6 +212,47 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</varlistentry>
<varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the remote transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. By using the <command>ALTER SUBSCRIPTION ... SKIP</command> command,
+ the logical replication worker skips all data modification changes within
+ the transaction. This option has no effect on the transactions that are
+ already prepared by enabling <literal>two_phase</literal> on the
+ subscriber.
+ After the logical replication worker successfully skips the transaction or
+ finishes a transaction, LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts. Using this command requires
+ superuser privilege.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>lsn</literal> (<type>pg_lsn</type>)</term>
+ <listitem>
+ <para>
+ Specifies the finish LSN of the remote transaction whose changes
+ are to be skipped by the logical replication worker. The finish LSN
+ is the LSN at which the transaction is either committed or prepared.
+ Skipping individual subtransactions is not supported. Setting
+ <literal>NONE</literal> resets the LSN.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
<para>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a6304f5..0ff0982 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -70,6 +70,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
sub->disableonerr = subform->subdisableonerr;
+ sub->skiplsn = subform->subskiplsn;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bb1ac30..bd48ee7 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,7 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subdisableonerr, subslotname,
+ substream, subtwophasestate, subdisableonerr, subskiplsn, subslotname,
subsynccommit, subpublications)
ON pg_subscription TO public;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3922658..a3d1270 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -45,6 +45,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/syscache.h"
/*
@@ -62,6 +63,7 @@
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
#define SUBOPT_DISABLE_ON_ERR 0x00000400
+#define SUBOPT_LSN 0x00000800
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -84,6 +86,7 @@ typedef struct SubOpts
bool streaming;
bool twophase;
bool disableonerr;
+ XLogRecPtr lsn;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -262,6 +265,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_DISABLE_ON_ERR;
opts->disableonerr = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_LSN) &&
+ strcmp(defel->defname, "lsn") == 0)
+ {
+ char *lsn_str = defGetString(defel);
+ XLogRecPtr lsn;
+
+ if (IsSet(opts->specified_opts, SUBOPT_LSN))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting lsn = NONE is treated as resetting LSN */
+ if (strcmp(lsn_str, "none") == 0)
+ lsn = InvalidXLogRecPtr;
+ else
+ {
+ /* Parse the argument as LSN */
+ lsn = DatumGetLSN(DirectFunctionCall1(pg_lsn_in,
+ CStringGetDatum(lsn_str)));
+
+ if (XLogRecPtrIsInvalid(lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL location (LSN): %s", lsn_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_LSN;
+ opts->lsn = lsn;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -479,6 +509,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
values[Anum_pg_subscription_subdisableonerr - 1] = BoolGetDatum(opts.disableonerr);
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1106,6 +1137,47 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ parse_subscription_options(pstate, stmt->options, SUBOPT_LSN, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only LSN option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_LSN));
+
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ /*
+ * If the user set subskiplsn, we do a sanity check to make
+ * sure that the specified LSN is a probable value.
+ */
+ if (!XLogRecPtrIsInvalid(opts.lsn))
+ {
+ RepOriginId originid;
+ char originname[NAMEDATALEN];
+ XLogRecPtr remote_lsn;
+
+ snprintf(originname, sizeof(originname), "pg_%u", subid);
+ originid = replorigin_by_name(originname, false);
+ remote_lsn = replorigin_get_progress(originid, false);
+
+ /* Check the given LSN is at least a future LSN */
+ if (!XLogRecPtrIsInvalid(remote_lsn) && opts.lsn < remote_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
+ LSN_FORMAT_ARGS(remote_lsn))));
+ }
+
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(opts.lsn);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ update_tuple = true;
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a03b33b..0036c2f 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9983,6 +9983,15 @@ AlterSubscriptionStmt:
(Node *)makeBoolean(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 03e069c..03dc305 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -189,6 +190,7 @@
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/rel.h"
#include "utils/rls.h"
#include "utils/syscache.h"
@@ -259,6 +261,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the remote transaction's finish LSN matches the subskiplsn.
+ * Once we start skipping changes, we don't stop it until we skip all changes of
+ * the transaction even if pg_subscription is updated and MySubscription->skiplsn
+ * gets changed or reset during that. Also, in streaming transaction cases, we
+ * don't skip receiving and spooling the changes since we decide whether or not
+ * to skip applying the changes when starting to apply changes. The subskiplsn is
+ * cleared after successfully skipping the transaction or applying non-empty
+ * transaction. The latter prevents the mistakenly specified subskiplsn from
+ * being left.
+ */
+static XLogRecPtr skip_xact_finish_lsn = InvalidXLogRecPtr;
+#define is_skipping_changes() (unlikely(!XLogRecPtrIsInvalid(skip_xact_finish_lsn)))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -336,6 +353,11 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(XLogRecPtr finish_lsn);
+static void stop_skipping_changes(void);
+static void clear_subscription_skip_lsn(XLogRecPtr finish_lsn);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, XLogRecPtr lsn);
@@ -795,6 +817,8 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -847,6 +871,8 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -905,9 +931,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of those reasons.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -928,6 +954,15 @@ apply_handle_prepare(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Since we have already prepared the transaction, in a case where the
+ * server crashes before clearing the subskiplsn, it will be left but the
+ * transaction won't be resent. But that's okay because it's a rare case
+ * and the subskiplsn will be cleared when finishing the next transaction.
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
}
@@ -969,6 +1004,8 @@ apply_handle_commit_prepared(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ clear_subscription_skip_lsn(prepare_data.end_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
}
@@ -1010,6 +1047,8 @@ apply_handle_rollback_prepared(StringInfo s)
FinishPreparedTransaction(gid, false);
end_replication_step();
CommitTransactionCommand();
+
+ clear_subscription_skip_lsn(rollback_data.rollback_end_lsn);
}
pgstat_report_stat(false);
@@ -1072,6 +1111,13 @@ apply_handle_stream_prepare(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Similar to prepare case, the subskiplsn could be left in a case of
+ * server crash but it's okay. See the comments in apply_handle_prepare().
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
@@ -1311,6 +1357,8 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
MemoryContext oldcxt;
BufFile *fd;
+ maybe_start_skipping_changes(lsn);
+
/* Make sure we have an open transaction */
begin_replication_step();
@@ -1455,9 +1503,27 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes();
+
+ /*
+ * Start a new transaction to clear the subskiplsn, if not started
+ * yet.
+ */
+ if (!IsTransactionState())
+ StartTransactionCommand();
+ }
+
if (IsTransactionState())
{
/*
+ * The transaction is either non-empty or skipped, so we clear the
+ * subskiplsn.
+ */
+ clear_subscription_skip_lsn(commit_data->commit_lsn);
+
+ /*
* Update origin state so we can restart streaming from correct
* position in case of crash.
*/
@@ -1583,7 +1649,12 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
return;
begin_replication_step();
@@ -1710,7 +1781,12 @@ apply_handle_update(StringInfo s)
RangeTblEntry *target_rte;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
return;
begin_replication_step();
@@ -1874,7 +1950,12 @@ apply_handle_delete(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
return;
begin_replication_step();
@@ -2261,7 +2342,12 @@ apply_handle_truncate(StringInfo s)
ListCell *lc;
LOCKMODE lockmode = AccessExclusiveLock;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
return;
begin_replication_step();
@@ -3738,6 +3824,139 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given LSN matches the
+ * LSN specified by subscription's skiplsn.
+ */
+static void
+maybe_start_skipping_changes(XLogRecPtr finish_lsn)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /*
+ * Quick return if we are not requested to skip this transaction. This
+ * function is called for every remote transaction, and we assume that
+ * transaction skipping is not used often.
+ */
+ if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn) ||
+ MySubscription->skiplsn != finish_lsn))
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skip_xact_finish_lsn = finish_lsn;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn)));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xact_finish_lsn if enabled.
+ */
+static void
+stop_skipping_changes(void)
+{
+ if (!is_skipping_changes())
+ return;
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn))));
+
+ /* Stop skipping changes */
+ skip_xact_finish_lsn = InvalidXLogRecPtr;
+}
+
+/*
+ * Clear subskiplsn of pg_subscription catalog.
+ *
+ * finish_lsn is the transaction's finish LSN, which is used to check whether
+ * the subskiplsn matches it. If they don't match, we raise a warning when
+ * clearing the subskiplsn, in order to inform users of cases where, e.g., they
+ * mistakenly specified the wrong subskiplsn.
+ */
+static void
+clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ XLogRecPtr myskiplsn = MySubscription->skiplsn;
+ bool started_tx = false;
+
+ if (likely(XLogRecPtrIsInvalid(myskiplsn)))
+ return;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskiplsn of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Clear the subskiplsn. If the user has already changed the subskiplsn
+ * before we clear it, we don't update the catalog and the replication
+ * origin state won't get advanced. So in the worst case, if the server
+ * crashes before sending an acknowledgment of the flush position, the
+ * transaction will be sent again and the user will need to set the
+ * subskiplsn again. We could reduce that possibility by logging a
+ * replication origin WAL record to advance the origin LSN instead, but
+ * there is no way to advance the origin timestamp, and it doesn't seem
+ * worth doing anything about it since it's a very rare case.
+ */
+ if (subform->subskiplsn == myskiplsn)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskiplsn */
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ if (myskiplsn != finish_lsn)
+ ereport(WARNING,
+ errmsg("skip-LSN of logical replication subscription \"%s\" cleared", MySubscription->name),
+ errdetail("Remote transaction's finish WAL location (LSN) %X/%X did not match skip-LSN %X/%X.",
+ LSN_FORMAT_ARGS(finish_lsn),
+ LSN_FORMAT_ARGS(myskiplsn)));
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4dd24b8..202bca4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4359,6 +4359,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't include subskiplsn in the dump
+ * because, after restoring the dump, this value may no longer be relevant.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 9229eac..4c6c370 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6084,7 +6084,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false};
+ false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6131,6 +6131,12 @@ describeSubscriptions(const char *pattern, bool verbose)
", subconninfo AS \"%s\"\n",
gettext_noop("Synchronous commit"),
gettext_noop("Conninfo"));
+
+ /* Skip LSN is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskiplsn AS \"%s\"\n",
+ gettext_noop("Skip LSN"));
}
/* Only display subscriptions in current database. */
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 1717282..11cb411 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1819,7 +1819,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP (",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1835,6 +1835,9 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit", "disable_on_error");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("lsn");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index e2befaf..69969a0 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
bool subdisableonerr; /* True if a worker error should cause the
* subscription to be disabled */
+ XLogRecPtr subskiplsn; /* All changes finished at this LSN are
+ * skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -109,6 +112,8 @@ typedef struct Subscription
bool disableonerr; /* Indicates if the subscription should be
* automatically disabled if a worker error
* occurs */
+ XLogRecPtr skiplsn; /* All changes finished at this LSN are
+ * skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 1617702..6f83a79 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3726,7 +3726,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ad8003f..7fcfad1 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,25 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/12345
+(1 row)
+
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -129,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +179,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +202,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -215,10 +229,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -233,10 +247,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +284,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +296,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +308,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -309,18 +323,18 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index a7c15b1..74c38ea 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,17 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
+
+\dRs+
+
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+
\dRs+
BEGIN;
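For reviewers, here is a sketch of the intended user-facing workflow on the subscriber, assuming the apply worker's error report includes the failed remote transaction's finish LSN (the subscription name and LSN value below are purely illustrative, not taken from the patch):

```sql
-- Suppose the apply worker keeps failing, and its error report shows the
-- problem transaction finished at LSN 0/14C0378 (illustrative value).
-- Tell the apply worker to skip that whole transaction:
ALTER SUBSCRIPTION mysub SKIP (lsn = '0/14C0378');

-- Once the transaction has been skipped, the worker clears subskiplsn
-- automatically. The setting can also be reset manually beforehand:
ALTER SUBSCRIPTION mysub SKIP (lsn = NONE);
```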
diff --git a/src/test/subscription/t/029_disable_on_error.pl b/src/test/subscription/t/029_disable_on_error.pl
deleted file mode 100644
index 5eca804..0000000
--- a/src/test/subscription/t/029_disable_on_error.pl
+++ /dev/null
@@ -1,94 +0,0 @@
-
-# Copyright (c) 2021-2022, PostgreSQL Global Development Group
-
-# Test of logical replication subscription self-disabling feature.
-use strict;
-use warnings;
-use PostgreSQL::Test::Cluster;
-use PostgreSQL::Test::Utils;
-use Test::More;
-
-# create publisher node
-my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
-$node_publisher->init(allows_streaming => 'logical');
-$node_publisher->start;
-
-# create subscriber node
-my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
-$node_subscriber->init;
-$node_subscriber->start;
-
-# Create identical table on both nodes.
-$node_publisher->safe_psql('postgres', "CREATE TABLE tbl (i INT)");
-$node_subscriber->safe_psql('postgres', "CREATE TABLE tbl (i INT)");
-
-# Insert duplicate values on the publisher.
-$node_publisher->safe_psql('postgres',
- "INSERT INTO tbl (i) VALUES (1), (1), (1)");
-
-# Create an additional unique index on the subscriber.
-$node_subscriber->safe_psql('postgres',
- "CREATE UNIQUE INDEX tbl_unique ON tbl (i)");
-
-# Create a pub/sub to set up logical replication. This tests that the
-# uniqueness violation will cause the subscription to fail during initial
-# synchronization and make it disabled.
-my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
-$node_publisher->safe_psql('postgres',
- "CREATE PUBLICATION pub FOR TABLE tbl");
-$node_subscriber->safe_psql('postgres',
- "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (disable_on_error = true)"
-);
-
-# Initial synchronization failure causes the subscription to be disabled.
-$node_subscriber->poll_query_until('postgres',
- "SELECT subenabled = false FROM pg_catalog.pg_subscription WHERE subname = 'sub'"
-) or die "Timed out while waiting for subscriber to be disabled";
-
-# Drop the unique index on the subscriber which caused the subscription to be
-# disabled.
-$node_subscriber->safe_psql('postgres', "DROP INDEX tbl_unique");
-
-# Re-enable the subscription "sub".
-$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
-
-# Wait for the data to replicate.
-$node_publisher->wait_for_catchup('sub');
-$node_subscriber->poll_query_until('postgres',
- "SELECT COUNT(1) = 0 FROM pg_subscription_rel sr WHERE sr.srsubstate NOT IN ('s', 'r') AND sr.srrelid = 'tbl'::regclass"
-);
-
-# Confirm that we have finished the table sync.
-my $result =
- $node_subscriber->safe_psql('postgres', "SELECT MAX(i), COUNT(*) FROM tbl");
-is($result, qq(1|3), "subscription sub replicated data");
-
-# Delete the data from the subscriber and recreate the unique index.
-$node_subscriber->safe_psql('postgres', "DELETE FROM tbl");
-$node_subscriber->safe_psql('postgres',
- "CREATE UNIQUE INDEX tbl_unique ON tbl (i)");
-
-# Add more non-unique data to the publisher.
-$node_publisher->safe_psql('postgres',
- "INSERT INTO tbl (i) VALUES (3), (3), (3)");
-
-# Apply failure causes the subscription to be disabled.
-$node_subscriber->poll_query_until('postgres',
- "SELECT subenabled = false FROM pg_catalog.pg_subscription WHERE subname = 'sub'"
-) or die "Timed out while waiting for subscription sub to be disabled";
-
-# Drop the unique index on the subscriber and re-enabled the subscription. Then
-# confirm that the previously failing insert was applied OK.
-$node_subscriber->safe_psql('postgres', "DROP INDEX tbl_unique");
-$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
-
-$node_publisher->wait_for_catchup('sub');
-
-$result = $node_subscriber->safe_psql('postgres',
- "SELECT COUNT(*) FROM tbl WHERE i = 3");
-is($result, qq(3), 'check the result of apply');
-
-$node_subscriber->stop;
-$node_publisher->stop;
-
-done_testing();
diff --git a/src/test/subscription/t/029_on_error.pl b/src/test/subscription/t/029_on_error.pl
new file mode 100644
index 0000000..06700ee
--- /dev/null
+++ b/src/test/subscription/t/029_on_error.pl
@@ -0,0 +1,183 @@
+
+# Copyright (c) 2021-2022, PostgreSQL Global Development Group
+
+# Tests for disable_on_error and SKIP transaction features.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $offset = 0;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with the subscriber. The finish LSN of the
+# failed transaction, which is passed to ALTER SUBSCRIPTION ... SKIP, is
+# fetched from the server log. After executing ALTER SUBSCRIPTION ... SKIP, we
+# check that logical replication can continue working by inserting
+# $nonconflict_data on the publisher.
+sub test_skip_xact
+{
+ my ($node_publisher, $node_subscriber, $nonconflict_data, $expected, $msg)
+ = @_;
+
+ # Wait until a conflict occurs on the subscriber.
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subenabled = FALSE FROM pg_subscription WHERE subname = 'sub'"
+ );
+
+ # Get the finish LSN of the error transaction.
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.tbl" in transaction \d+ finished at ([[:xdigit:]]+\/[[:xdigit:]]+)/
+ or die "could not get error-LSN";
+ my $lsn = $1;
+
+ # Set skip lsn.
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION sub SKIP (lsn = '$lsn')");
+
+ # Re-enable the subscription.
+ $node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'"
+ );
+
+ # Check the log to ensure that the transaction is skipped, and advance the
+ # offset of the log file for the next test.
+ $offset = $node_subscriber->wait_for_log(
+ qr/LOG: done skipping logical replication transaction finished at $lsn/,
+ $offset);
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql('postgres',
+ "INSERT INTO tbl VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup('sub');
+
+ # Check replicated data
+ my $res =
+ $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM tbl");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node. Set a low value of logical_decoding_work_mem to test
+# streaming cases.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf(
+ 'postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber, we
+# create the same table but with a primary key. Also, insert some data that
+# will conflict with the data replicated from the publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE TABLE tbl (i INT, t TEXT);
+INSERT INTO tbl VALUES (1, NULL);
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE TABLE tbl (i INT PRIMARY KEY, t TEXT);
+INSERT INTO tbl VALUES (1, NULL);
+]);
+
+# Create a pub/sub to set up logical replication. This tests that the
+# uniqueness violation will cause the subscription to fail during initial
+# synchronization and make it disabled.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub FOR TABLE tbl");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (disable_on_error = true, streaming = on, two_phase = on)"
+);
+
+# Initial synchronization failure causes the subscription to be disabled.
+$node_subscriber->poll_query_until('postgres',
+ "SELECT subenabled = false FROM pg_catalog.pg_subscription WHERE subname = 'sub'"
+) or die "Timed out while waiting for subscriber to be disabled";
+
+# Truncate the table on the subscriber which caused the subscription to be
+# disabled.
+$node_subscriber->safe_psql('postgres', "TRUNCATE tbl");
+
+# Re-enable the subscription "sub".
+$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+# Wait for the data to replicate.
+$node_publisher->wait_for_catchup('sub');
+$node_subscriber->poll_query_until('postgres',
+ "SELECT COUNT(1) = 0 FROM pg_subscription_rel sr WHERE sr.srsubstate NOT IN ('s', 'r') AND sr.srrelid = 'tbl'::regclass"
+);
+
+# Confirm that we have finished the table sync.
+my $result =
+ $node_subscriber->safe_psql('postgres', "SELECT COUNT(*) FROM tbl");
+is($result, qq(1), "subscription sub replicated data");
+
+# Insert data to tbl, raising an error on the subscriber due to violation
+# of the unique constraint on tbl. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO tbl VALUES (1, NULL);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber,
+ "(2, NULL)", "2", "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data to tbl and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO tbl VALUES (1, NULL);
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_xact($node_publisher, $node_subscriber,
+ "(3, NULL)", "3", "test skipping prepare and commit prepared");
+
+# Test for STREAM COMMIT. Insert enough rows to tbl to exceed the 64kB
+# logical_decoding_work_mem limit, raising an error on the subscriber while
+# applying the spooled changes for the same reason. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO tbl SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_xact($node_publisher, $node_subscriber, "(4, md5(4::text))",
+ "4", "test skipping stream-commit");
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT COUNT(*) FROM pg_prepared_xacts");
+is($result, "0",
+ "check all prepared transactions are resolved on the subscriber");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
1.8.3.1
On Thursday, March 17, 2022 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
The patch LGTM. I have made minor changes in comments and docs in the
attached patch. Kindly let me know what you think of the attached?
Hi, thank you for the patch. Few minor comments.
(1) comment of maybe_start_skipping_changes
+ /*
+ * Quick return if it's not requested to skip this transaction. This
+ * function is called for every remote transaction and we assume that
+ * skipping the transaction is not used often.
+ */
I feel this comment should explain more about our intention and
what it confirms. In a case where the user requests a skip
but it doesn't match the condition, we don't start
skipping changes, strictly speaking.
From:
Quick return if it's not requested to skip this transaction.
To:
Quick return if we can't ensure possible skiplsn is set
and it equals to the finish LSN of this transaction.
(2) 029_on_error.pl
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.tbl" in transaction \d+ finishe$
+ or die "could not get error-LSN";
I think we shouldn't use a lot of new words.
How about a change below ?
From:
could not get error-LSN
To:
failed to find expected error message that contains finish LSN for SKIP option
(3) apply_handle_commit_internal
Lastly, may I have the reasons to call both
stop_skipping_changes and clear_subscription_skip_lsn
in this function, instead of having them at the end
of apply_handle_commit and apply_handle_stream_commit ?
IMHO, this structure seems to create extra
conditional branches in apply_handle_commit_internal.
Also, because of this code, when we call stop_skipping_changes
in apply_handle_commit_internal after checking that
is_skipping_changes() returns true, we check
is_skipping_changes() again at the top of stop_skipping_changes.
OTOH, for other cases like apply_handle_prepare and apply_handle_stream_prepare,
we call those two functions (or either one) depending on the needs,
after the existing commits and during the closing processing.
(In the case of rollback_prepare, it's also called after the existing commit.)
I feel that if we move those two functions to the end
of apply_handle_commit and apply_handle_stream_commit,
we will have more aligned code and improve readability.
Best Regards,
Takamichi Osumi
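As an aside for readers of the thread: the log-scraping step that comment (2) above discusses — fetching the finish LSN of the failed transaction from the server log — can be sketched outside the Perl test harness roughly as follows. This is only an illustration; the log-line format is copied from the test's regex, and `extract_finish_lsn` is a hypothetical helper name, not part of the patch:

```python
import re

# Regex mirroring the Perl pattern in 029_on_error.pl: it captures the
# finish LSN ("%X/%X") from the apply worker's error context message.
LSN_RE = re.compile(
    r'processing remote data for replication origin "pg_\d+" '
    r'during "INSERT" for replication target relation "public\.tbl" '
    r'in transaction \d+ finished at ([0-9A-Fa-f]+/[0-9A-Fa-f]+)'
)

def extract_finish_lsn(log_contents: str):
    """Return the finish LSN of the failed transaction, or None if absent."""
    m = LSN_RE.search(log_contents)
    return m.group(1) if m else None

# Example log line shaped like the CONTEXT message in the thread.
line = ('ERROR: ... CONTEXT: processing remote data for replication origin '
        '"pg_16395" during "INSERT" for replication target relation '
        '"public.tbl" in transaction 731 finished at 0/14C0378')
```

The captured value is exactly what ALTER SUBSCRIPTION ... SKIP (lsn = ...) would then be given.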
On Thu, Mar 17, 2022 at 12:39 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Thursday, March 17, 2022 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
The patch LGTM. I have made minor changes in comments and docs in the
attached patch. Kindly let me know what you think of the attached?
Hi, thank you for the patch. Few minor comments.
(1) comment of maybe_start_skipping_changes
+ /*
+  * Quick return if it's not requested to skip this transaction. This
+  * function is called for every remote transaction and we assume that
+  * skipping the transaction is not used often.
+  */
I feel this comment should explain more about our intention and
what it confirms. In a case when user requests skip,
but it doesn't match the condition, we don't start
skipping changes, strictly speaking.
From:
Quick return if it's not requested to skip this transaction.
To:
Quick return if we can't ensure possible skiplsn is set
and it equals to the finish LSN of this transaction.
Hmm, the current comment seems more appropriate. What you are
suggesting is almost writing the code in sentence form.
(2) 029_on_error.pl
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.tbl" in transaction \d+ finishe$
+ or die "could not get error-LSN";
I think we shouldn't use a lot of new words.
How about a change below ?
From:
could not get error-LSN
To:
failed to find expected error message that contains finish LSN for SKIP option
(3) apply_handle_commit_internal
...
I feel if we move those two functions at the end
of the apply_handle_commit and apply_handle_stream_commit,
then we will have more aligned codes and improve readability.
I think the intention is to avoid duplicate code as we have a common
function that gets called from both of those. OTOH, if Sawada-San or
others also prefer your approach to rearrange the code then I am fine
with it.
--
With Regards,
Amit Kapila.
On Thu, Mar 17, 2022 at 5:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Mar 17, 2022 at 12:39 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Thursday, March 17, 2022 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
The patch LGTM. I have made minor changes in comments and docs in the
attached patch. Kindly let me know what you think of the attached?
Hi, thank you for the patch. Few minor comments.
(1) comment of maybe_start_skipping_changes
+ /*
+  * Quick return if it's not requested to skip this transaction. This
+  * function is called for every remote transaction and we assume that
+  * skipping the transaction is not used often.
+  */
I feel this comment should explain more about our intention and
what it confirms. In a case when user requests skip,
but it doesn't match the condition, we don't start
skipping changes, strictly speaking.
From:
Quick return if it's not requested to skip this transaction.
To:
Quick return if we can't ensure possible skiplsn is set
and it equals to the finish LSN of this transaction.
Hmm, the current comment seems more appropriate. What you are
suggesting is almost writing the code in sentence form.
(2) 029_on_error.pl
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.tbl" in transaction \d+ finishe$
+ or die "could not get error-LSN";
I think we shouldn't use a lot of new words.
How about a change below ?
From:
could not get error-LSN
To:
failed to find expected error message that contains finish LSN for SKIP option
(3) apply_handle_commit_internal
...
I feel if we move those two functions at the end
of the apply_handle_commit and apply_handle_stream_commit,
then we will have more aligned codes and improve readability.
I think we cannot just move them to the end of apply_handle_commit()
and apply_handle_stream_commit(). Because if we do that, we end up
missing updating replication_session_origin_lsn/timestamp when
clearing the subskiplsn if we're skipping a non-stream transaction.
Basically, the apply worker differently handles 2pc transactions and
non-2pc transactions; we always prepare even empty transactions
whereas we don't commit empty non-2pc transactions. So I think we
don’t have to handle both in the same way.
I think the intention is to avoid duplicate code as we have a common
function that gets called from both of those.
Yes.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
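For readers following comment (1) above: the quick-return guard being debated can be modeled in a few lines. This is a toy sketch only, assuming the worker compares the subscription's subskiplsn against the finish LSN of the incoming transaction; the function and constant names here are illustrative, not from the patch:

```python
INVALID_LSN = 0  # models InvalidXLogRecPtr, i.e. an unset subskiplsn ("0/0")

def should_start_skipping(subskiplsn: int, finish_lsn: int) -> bool:
    # Quick return unless skipping was requested for exactly this
    # transaction: subskiplsn must be set and equal the finish LSN.
    return subskiplsn != INVALID_LSN and subskiplsn == finish_lsn
```

This is why the guard fires for every remote transaction but almost always falls through: skipping is expected to be rare.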
On Thursday, March 17, 2022 7:56 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Mar 17, 2022 at 5:52 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Thu, Mar 17, 2022 at 12:39 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Thursday, March 17, 2022 3:04 PM Amit Kapila
<amit.kapila16@gmail.com> wrote:
On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
The patch LGTM. I have made minor changes in comments and docs in
the attached patch. Kindly let me know what you think of the attached?
Hi, thank you for the patch. Few minor comments.
(3) apply_handle_commit_internal
...
I feel if we move those two functions at the end of the
apply_handle_commit and apply_handle_stream_commit, then we will
have more aligned codes and improve readability.
I think we cannot just move them to the end of apply_handle_commit() and
apply_handle_stream_commit(). Because if we do that, we end up missing
updating replication_session_origin_lsn/timestamp when clearing the
subskiplsn if we're skipping a non-stream transaction.
Basically, the apply worker differently handles 2pc transactions and non-2pc
transactions; we always prepare even empty transactions whereas we don't
commit empty non-2pc transactions. So I think we don't have to handle both in
the same way.
Okay. Thank you so much for your explanation.
Then the code looks good to me.
Best Regards,
Takamichi Osumi
On Thu, Mar 17, 2022 at 3:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
The patch LGTM. I have made minor changes in comments and docs in the
attached patch. Kindly let me know what you think of the attached?
Thank you for updating the patch. It looks good to me.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Mar 17, 2022, at 3:03 AM, Amit Kapila wrote:
On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
The patch LGTM. I have made minor changes in comments and docs in the
attached patch. Kindly let me know what you think of the attached?
I am planning to commit this early next week (on Monday) unless there
are more comments/suggestions.
I reviewed this last version and I have a few comments.
+ * If the user set subskiplsn, we do a sanity check to make
+ * sure that the specified LSN is a probable value.
... user *sets*...
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
+ LSN_FORMAT_ARGS(remote_lsn))));
Shouldn't we add the LSN to be skipped in the "(LSN)"?
+ * Start a new transaction to clear the subskipxid, if not started
+ * yet.
It seems it means subskiplsn.
+ * subskipxid in order to inform users for cases e.g., where the user mistakenly
+ * specified the wrong subskiplsn.
It seems it means subskiplsn.
+sub test_skip_xact
+{
It seems this function should be named test_skip_lsn. Unless the intention is
to cover other skip options in the future.
src/test/subscription/t/029_disable_on_error.pl | 94 ----------
src/test/subscription/t/029_on_error.pl | 183 +++++++++++++++++++
It seems you are removing a test for 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33.
I should also name 029_on_error.pl to something else such as 030_skip_lsn.pl or
a generic name 030_skip_option.pl.
--
Euler Taveira
EDB https://www.enterprisedb.com/
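Euler's second comment touches the parse-and-validate step for the lsn option, which operates on pg_lsn values of the form X/Y (two hex halves of a 64-bit position), with NONE resetting the skip LSN. A rough, hypothetical model of that step — not the patch's C code, and `parse_lsn` is an illustrative name:

```python
def parse_lsn(lsn_str: str) -> int:
    """Parse a pg_lsn-style string "X/Y" (hex halves) into a 64-bit
    integer. 'none' resets the skip LSN (modeled here as 0, i.e. the
    invalid LSN); an explicit 0/0 is rejected as invalid input."""
    if lsn_str == "none":
        return 0
    hi, lo = lsn_str.split("/")
    lsn = (int(hi, 16) << 32) | int(lo, 16)
    if lsn == 0:
        raise ValueError(f"invalid WAL location (LSN): {lsn_str}")
    return lsn
```

The distinction between "reset" (NONE) and "invalid" (0/0) mirrors the option handling discussed in the patch.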
On Mon, Mar 21, 2022 at 7:09 AM Euler Taveira <euler@eulerto.com> wrote:
src/test/subscription/t/029_disable_on_error.pl | 94 ----------
src/test/subscription/t/029_on_error.pl | 183 +++++++++++++
It seems you are removing a test for 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33.
We have covered the same test in the new test file. See "CREATE
SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH
(disable_on_error = true, ...". This will test the cases we were
earlier testing via 'disable_on_error'.
I should also name 029_on_error.pl to something else such as 030_skip_lsn.pl or
a generic name 030_skip_option.pl.
The reason to keep the name 'on_error' is that it has tests for both
'disable_on_error' option and 'skip_lsn'. The other option could be
'on_error_action' or something like that. Now, does this make sense to
you?
--
With Regards,
Amit Kapila.
On Mon, Mar 21, 2022 at 7:09 AM Euler Taveira <euler@eulerto.com> wrote:
On Thu, Mar 17, 2022, at 3:03 AM, Amit Kapila wrote:
On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've attached an updated version patch.
The patch LGTM. I have made minor changes in comments and docs in the
attached patch. Kindly let me know what you think of the attached?
I am planning to commit this early next week (on Monday) unless there
are more comments/suggestions.
I reviewed this last version and I have a few comments.
+ * If the user set subskiplsn, we do a sanity check to make
+ * sure that the specified LSN is a probable value.
... user *sets*...
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
+ LSN_FORMAT_ARGS(remote_lsn))));
Shouldn't we add the LSN to be skipped in the "(LSN)"?
+ * Start a new transaction to clear the subskipxid, if not started
+ * yet.
It seems it means subskiplsn.
+ * subskipxid in order to inform users for cases e.g., where the user mistakenly
+ * specified the wrong subskiplsn.
It seems it means subskiplsn.
+sub test_skip_xact
+{
It seems this function should be named test_skip_lsn. Unless the intention is
to cover other skip options in the future.
I have fixed all the above comments as per your suggestion in the
attached. Do let me know if something is missed?
src/test/subscription/t/029_disable_on_error.pl | 94 ----------
src/test/subscription/t/029_on_error.pl | 183 +++++++++++++
It seems you are removing a test for 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33.
I should also name 029_on_error.pl to something else such as 030_skip_lsn.pl or
a generic name 030_skip_option.pl.
As explained in my previous email, I don't think any change is
required for this comment but do let me know if you still think so?
--
With Regards,
Amit Kapila.
Attachments:
v17-0001-Add-ALTER-SUBSCRIPTION-.-SKIP.patch (application/octet-stream)
From 9b1811392c913aca2ee7da7011233402510acfd3 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 10 Dec 2021 14:41:30 +0900
Subject: [PATCH v17] Add ALTER SUBSCRIPTION ... SKIP.
This feature allows skipping the transaction on subscriber nodes.
If an incoming change violates any constraint, logical replication stops
until it's resolved. Currently, users need to either manually resolve the
conflict by updating the subscriber-side database or use the
pg_replication_origin_advance() function to skip the conflicting transaction. This
commit introduces a simpler way to skip the conflicting transactions.
The user can specify the LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX),
which allows the apply worker to skip the transaction finished at the
specified LSN. The apply worker skips all data modification changes within
the transaction.
Author: Masahiko Sawada
Reviewed-by: Takamichi Osumi, Hou Zhijie, Peter Eisentraut, Amit Kapila, Shi Yu, Vignesh C, Greg Nancarrow, Haiying Tang, Euler Taveira
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/logical-replication.sgml | 27 +-
doc/src/sgml/ref/alter_subscription.sgml | 42 ++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 2 +-
src/backend/commands/subscriptioncmds.c | 73 ++++++
src/backend/parser/gram.y | 9 +
src/backend/replication/logical/worker.c | 233 +++++++++++++++++-
src/bin/pg_dump/pg_dump.c | 4 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 5 +-
src/include/catalog/pg_subscription.h | 5 +
src/include/nodes/parsenodes.h | 3 +-
src/test/regress/expected/subscription.out | 126 +++++-----
src/test/regress/sql/subscription.sql | 11 +
.../subscription/t/029_disable_on_error.pl | 94 -------
src/test/subscription/t/029_on_error.pl | 183 ++++++++++++++
17 files changed, 664 insertions(+), 172 deletions(-)
delete mode 100644 src/test/subscription/t/029_disable_on_error.pl
create mode 100644 src/test/subscription/t/029_on_error.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 4dc5b34d21..2a8cd02664 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7797,6 +7797,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subskiplsn</structfield> <type>pg_lsn</type>
+ </para>
+ <para>
+ Finish LSN of the transaction whose changes are to be skipped, if a valid
+ LSN; otherwise <literal>0/0</literal>.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 6431d4796d..555fbd749c 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -362,19 +362,24 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
</screen>
The LSN of the transaction that contains the change violating the constraint and
the replication origin name can be found from the server log (LSN 0/14C0378 and
- replication origin <literal>pg_16395</literal> in the above case). To skip the
- transaction, the subscription needs to be disabled temporarily by
- <command>ALTER SUBSCRIPTION ... DISABLE</command> first or alternatively, the
+ replication origin <literal>pg_16395</literal> in the above case). The
+ transaction that produces conflict can be skipped by using
+ <command>ALTER SUBSCRIPTION ... SKIP</command> with the finish LSN
+ (i.e., LSN 0/14C0378). The finish LSN could be an LSN at which the transaction
+ is committed or prepared on the publisher. Alternatively, the transaction can
+ also be skipped by calling the <link linkend="pg-replication-origin-advance">
+ <function>pg_replication_origin_advance()</function></link> function.
+ Before using this function, the subscription needs to be disabled
+ temporarily either by <command>ALTER SUBSCRIPTION ... DISABLE</command>, or the
subscription can be used with the <literal>disable_on_error</literal> option.
- Then, the transaction can be skipped by calling the
- <link linkend="pg-replication-origin-advance">
- <function>pg_replication_origin_advance()</function></link> function with
- the <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>) and the
- next LSN of the transaction's LSN (i.e., LSN 0/14C0379). After that the replication
- can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>. The current
- position of origins can be seen in the
- <link linkend="view-pg-replication-origin-status">
+ Then, you can use the <function>pg_replication_origin_advance()</function> function
+ with the <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>)
+ and the next LSN of the finish LSN (i.e., 0/14C0379). The current position of
+ origins can be seen in the <link linkend="view-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname></link> system view.
+ Please note that skipping the whole transaction includes skipping changes that
+ might not violate any constraint. This can easily make the subscriber
+ inconsistent.
</para>
</sect1>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 58b78a94ea..ac2db249cb 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -29,6 +29,7 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> REFRESH PUB
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> ENABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> DISABLE
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SET ( <replaceable class="parameter">subscription_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] )
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <replaceable>new_name</replaceable>
</synopsis>
@@ -210,6 +211,47 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable> )</literal></term>
+ <listitem>
+ <para>
+ Skips applying all changes of the remote transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. By using the <command>ALTER SUBSCRIPTION ... SKIP</command> command,
+ the logical replication worker skips all data modification changes within
+ the transaction. This option has no effect on the transactions that are
+ already prepared by enabling <literal>two_phase</literal> on the
+ subscriber.
+ After the logical replication worker successfully skips the transaction or
+ finishes a transaction, the LSN (stored in
+ <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+ is cleared. See <xref linkend="logical-replication-conflicts"/> for
+ the details of logical replication conflicts. Using this command requires
+ superuser privilege.
+ </para>
+
+ <para>
+ <replaceable>skip_option</replaceable> specifies options for this operation.
+ The supported option is:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>lsn</literal> (<type>pg_lsn</type>)</term>
+ <listitem>
+ <para>
+ Specifies the finish LSN of the remote transaction whose changes
+ are to be skipped by the logical replication worker. The finish LSN
+ is the LSN at which the transaction is either committed or prepared.
+ Skipping an individual subtransaction is not supported. Setting
+ <literal>NONE</literal> resets the LSN.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">new_owner</replaceable></term>
<listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a6304f5f81..0ff0982f7b 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -70,6 +70,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
sub->disableonerr = subform->subdisableonerr;
+ sub->skiplsn = subform->subskiplsn;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bb1ac30cd1..bd48ee7bd2 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,7 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subdisableonerr, subslotname,
+ substream, subtwophasestate, subdisableonerr, subskiplsn, subslotname,
subsynccommit, subpublications)
ON pg_subscription TO public;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3922658bbc..e16f04626d 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -45,6 +45,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/syscache.h"
/*
@@ -62,6 +63,7 @@
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
#define SUBOPT_DISABLE_ON_ERR 0x00000400
+#define SUBOPT_LSN 0x00000800
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -84,6 +86,7 @@ typedef struct SubOpts
bool streaming;
bool twophase;
bool disableonerr;
+ XLogRecPtr lsn;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -262,6 +265,33 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_DISABLE_ON_ERR;
opts->disableonerr = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_LSN) &&
+ strcmp(defel->defname, "lsn") == 0)
+ {
+ char *lsn_str = defGetString(defel);
+ XLogRecPtr lsn;
+
+ if (IsSet(opts->specified_opts, SUBOPT_LSN))
+ errorConflictingDefElem(defel, pstate);
+
+ /* Setting lsn = NONE is treated as resetting LSN */
+ if (strcmp(lsn_str, "none") == 0)
+ lsn = InvalidXLogRecPtr;
+ else
+ {
+ /* Parse the argument as LSN */
+ lsn = DatumGetLSN(DirectFunctionCall1(pg_lsn_in,
+ CStringGetDatum(lsn_str)));
+
+ if (XLogRecPtrIsInvalid(lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL location (LSN): %s", lsn_str)));
+ }
+
+ opts->specified_opts |= SUBOPT_LSN;
+ opts->lsn = lsn;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -479,6 +509,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
values[Anum_pg_subscription_subdisableonerr - 1] = BoolGetDatum(opts.disableonerr);
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1106,6 +1137,48 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
break;
}
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ parse_subscription_options(pstate, stmt->options, SUBOPT_LSN, &opts);
+
+ /* ALTER SUBSCRIPTION ... SKIP supports only the LSN option */
+ Assert(IsSet(opts.specified_opts, SUBOPT_LSN));
+
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ /*
+ * If the user sets subskiplsn, we do a sanity check to make
+ * sure that the specified LSN is a plausible value.
+ */
+ if (!XLogRecPtrIsInvalid(opts.lsn))
+ {
+ RepOriginId originid;
+ char originname[NAMEDATALEN];
+ XLogRecPtr remote_lsn;
+
+ snprintf(originname, sizeof(originname), "pg_%u", subid);
+ originid = replorigin_by_name(originname, false);
+ remote_lsn = replorigin_get_progress(originid, false);
+
+ /* Check that the given LSN is not behind the origin's current LSN */
+ if (!XLogRecPtrIsInvalid(remote_lsn) && opts.lsn < remote_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skip WAL location (LSN %X/%X) must be greater than origin LSN %X/%X",
+ LSN_FORMAT_ARGS(opts.lsn),
+ LSN_FORMAT_ARGS(remote_lsn))));
+ }
+
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(opts.lsn);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ update_tuple = true;
+ break;
+ }
+
default:
elog(ERROR, "unrecognized ALTER SUBSCRIPTION kind %d",
stmt->kind);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a03b33b53b..0036c2f9e2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9983,6 +9983,15 @@ AlterSubscriptionStmt:
(Node *)makeBoolean(false), @1));
$$ = (Node *)n;
}
+ | ALTER SUBSCRIPTION name SKIP definition
+ {
+ AlterSubscriptionStmt *n =
+ makeNode(AlterSubscriptionStmt);
+ n->kind = ALTER_SUBSCRIPTION_SKIP;
+ n->subname = $3;
+ n->options = $5;
+ $$ = (Node *)n;
+ }
;
/*****************************************************************************
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 03e069c7cd..82dcffc2db 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -136,6 +136,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/partition.h"
#include "catalog/pg_inherits.h"
@@ -189,6 +190,7 @@
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/pg_lsn.h"
#include "utils/rel.h"
#include "utils/rls.h"
#include "utils/syscache.h"
@@ -259,6 +261,21 @@ static bool in_streamed_transaction = false;
static TransactionId stream_xid = InvalidTransactionId;
+/*
+ * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
+ * the subscription if the remote transaction's finish LSN matches the
+ * subskiplsn. Once we start skipping changes, we don't stop until we have
+ * skipped all changes of the transaction, even if pg_subscription is updated
+ * and MySubscription->skiplsn gets changed or reset in the meantime. Also, in
+ * streaming transaction cases, we don't skip receiving and spooling the
+ * changes, since we decide whether to skip applying them only when starting
+ * to apply changes. The subskiplsn is cleared after successfully skipping the
+ * transaction or after applying a non-empty transaction; the latter prevents
+ * a mistakenly specified subskiplsn from lingering.
+ */
+static XLogRecPtr skip_xact_finish_lsn = InvalidXLogRecPtr;
+#define is_skipping_changes() (unlikely(!XLogRecPtrIsInvalid(skip_xact_finish_lsn)))
+
/* BufFile handle of the current streaming file */
static BufFile *stream_fd = NULL;
@@ -336,6 +353,11 @@ static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int
/* Common streaming function to apply all the spooled messages */
static void apply_spooled_messages(TransactionId xid, XLogRecPtr lsn);
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(XLogRecPtr finish_lsn);
+static void stop_skipping_changes(void);
+static void clear_subscription_skip_lsn(XLogRecPtr finish_lsn);
+
/* Functions for apply error callback */
static void apply_error_callback(void *arg);
static inline void set_apply_error_context_xact(TransactionId xid, XLogRecPtr lsn);
@@ -795,6 +817,8 @@ apply_handle_begin(StringInfo s)
remote_final_lsn = begin_data.final_lsn;
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -847,6 +871,8 @@ apply_handle_begin_prepare(StringInfo s)
remote_final_lsn = begin_data.prepare_lsn;
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
+
in_remote_transaction = true;
pgstat_report_activity(STATE_RUNNING, NULL);
@@ -905,9 +931,9 @@ apply_handle_prepare(StringInfo s)
/*
* Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction. It is done this way because at
- * commit prepared time, we won't know whether we have skipped preparing a
- * transaction because of no change.
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction for either of those reasons.
*
* XXX, We can optimize such that at commit prepared time, we first check
* whether we have prepared the transaction or not but that doesn't seem
@@ -928,6 +954,15 @@ apply_handle_prepare(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Since we have already prepared the transaction, if the server crashes
+ * before clearing the subskiplsn, it will be left behind, but the
+ * transaction won't be resent. That's okay because it's a rare case, and
+ * the subskiplsn will be cleared when finishing the next transaction.
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
}
@@ -969,6 +1004,8 @@ apply_handle_commit_prepared(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ clear_subscription_skip_lsn(prepare_data.end_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
}
@@ -1010,6 +1047,8 @@ apply_handle_rollback_prepared(StringInfo s)
FinishPreparedTransaction(gid, false);
end_replication_step();
CommitTransactionCommand();
+
+ clear_subscription_skip_lsn(rollback_data.rollback_end_lsn);
}
pgstat_report_stat(false);
@@ -1072,6 +1111,13 @@ apply_handle_stream_prepare(StringInfo s)
/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Similar to the prepare case, the subskiplsn could be left behind if the
+ * server crashes, but that's okay. See the comments in
+ * apply_handle_prepare().
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
@@ -1311,6 +1357,8 @@ apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
MemoryContext oldcxt;
BufFile *fd;
+ maybe_start_skipping_changes(lsn);
+
/* Make sure we have an open transaction */
begin_replication_step();
@@ -1455,8 +1503,26 @@ apply_handle_stream_commit(StringInfo s)
static void
apply_handle_commit_internal(LogicalRepCommitData *commit_data)
{
+ if (is_skipping_changes())
+ {
+ stop_skipping_changes();
+
+ /*
+ * Start a new transaction to clear the subskiplsn, if not started
+ * yet.
+ */
+ if (!IsTransactionState())
+ StartTransactionCommand();
+ }
+
if (IsTransactionState())
{
+ /*
+ * The transaction is either non-empty or skipped, so we clear the
+ * subskiplsn.
+ */
+ clear_subscription_skip_lsn(commit_data->commit_lsn);
+
/*
* Update origin state so we can restart streaming from correct
* position in case of crash.
@@ -1583,7 +1649,12 @@ apply_handle_insert(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
return;
begin_replication_step();
@@ -1710,7 +1781,12 @@ apply_handle_update(StringInfo s)
RangeTblEntry *target_rte;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
return;
begin_replication_step();
@@ -1874,7 +1950,12 @@ apply_handle_delete(StringInfo s)
TupleTableSlot *remoteslot;
MemoryContext oldctx;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
return;
begin_replication_step();
@@ -2261,7 +2342,12 @@ apply_handle_truncate(StringInfo s)
ListCell *lc;
LOCKMODE lockmode = AccessExclusiveLock;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
+ /*
+ * Quick return if we are skipping data modification changes or handling
+ * streamed transactions.
+ */
+ if (is_skipping_changes() ||
+ handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
return;
begin_replication_step();
@@ -3738,6 +3824,139 @@ IsLogicalWorker(void)
return MyLogicalRepWorker != NULL;
}
+/*
+ * Start skipping changes of the transaction if the given LSN matches the
+ * LSN specified by the subscription's skiplsn.
+ */
+static void
+maybe_start_skipping_changes(XLogRecPtr finish_lsn)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /*
+ * Quick return if we are not requested to skip this transaction. This
+ * function is called for every remote transaction, and we assume that
+ * skipping transactions is rare.
+ */
+ if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn) ||
+ MySubscription->skiplsn != finish_lsn))
+ return;
+
+ /* Start skipping all changes of this transaction */
+ skip_xact_finish_lsn = finish_lsn;
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn)));
+}
+
+/*
+ * Stop skipping changes by resetting skip_xact_finish_lsn if enabled.
+ */
+static void
+stop_skipping_changes(void)
+{
+ if (!is_skipping_changes())
+ return;
+
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn))));
+
+ /* Stop skipping changes */
+ skip_xact_finish_lsn = InvalidXLogRecPtr;
+}
+
+/*
+ * Clear subskiplsn of pg_subscription catalog.
+ *
+ * finish_lsn is the transaction's finish LSN, used to check whether the
+ * subskiplsn matches it. If it does not match, we raise a warning when
+ * clearing the subskiplsn, in order to inform users of cases where, e.g.,
+ * they mistakenly specified the wrong subskiplsn.
+ */
+static void
+clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ XLogRecPtr myskiplsn = MySubscription->skiplsn;
+ bool started_tx = false;
+
+ if (likely(XLogRecPtrIsInvalid(myskiplsn)))
+ return;
+
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }
+
+ /*
+ * Protect subskiplsn of pg_subscription from being concurrently updated
+ * while clearing it.
+ */
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+ /* Fetch the existing tuple. */
+ tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+ ObjectIdGetDatum(MySubscription->oid));
+
+ if (!HeapTupleIsValid(tup))
+ elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
+
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /*
+ * Clear the subskiplsn. If the user has already changed the subskiplsn
+ * before we clear it, we don't update the catalog, and the replication
+ * origin state won't be advanced. So in the worst case, if the server
+ * crashes before sending an acknowledgment of the flush position, the
+ * transaction will be sent again and the user will need to set the
+ * subskiplsn again. We could reduce the possibility by logging a
+ * replication origin WAL record to advance the origin LSN instead, but
+ * there is no way to advance the origin timestamp, and it doesn't seem
+ * worth doing anything about it since it's a very rare case.
+ */
+ if (subform->subskiplsn == myskiplsn)
+ {
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ /* reset subskiplsn */
+ values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ replaces[Anum_pg_subscription_subskiplsn - 1] = true;
+
+ tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+ replaces);
+ CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+ if (myskiplsn != finish_lsn)
+ ereport(WARNING,
+ errmsg("skip-LSN of logical replication subscription \"%s\" cleared", MySubscription->name),
+ errdetail("Remote transaction's finish WAL location (LSN) %X/%X did not match skip-LSN %X/%X.",
+ LSN_FORMAT_ARGS(finish_lsn),
+ LSN_FORMAT_ARGS(myskiplsn)));
+ }
+
+ heap_freetuple(tup);
+ table_close(rel, NoLock);
+
+ if (started_tx)
+ CommitTransactionCommand();
+}
+
/* Error callback to give more context info about the change being applied */
static void
apply_error_callback(void *arg)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 725cd2e4eb..e5816c4cce 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4385,6 +4385,10 @@ getSubscriptions(Archive *fout)
ntups = PQntuples(res);
+ /*
+ * Get subscription fields. We don't include subskiplsn in the dump
+ * since, after restoring the dump, this value may no longer be relevant.
+ */
i_tableoid = PQfnumber(res, "tableoid");
i_oid = PQfnumber(res, "oid");
i_subname = PQfnumber(res, "subname");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 991bfc1546..714097cad1 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6105,7 +6105,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false};
+ false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6152,6 +6152,12 @@ describeSubscriptions(const char *pattern, bool verbose)
", subconninfo AS \"%s\"\n",
gettext_noop("Synchronous commit"),
gettext_noop("Conninfo"));
+
+ /* Skip LSN is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subskiplsn AS \"%s\"\n",
+ gettext_noop("Skip LSN"));
}
/* Only display subscriptions in current database. */
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 183abcc275..5c064595a9 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1852,7 +1852,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH PUBLICATION", "SET",
+ "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP (",
"ADD PUBLICATION", "DROP PUBLICATION");
/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1868,6 +1868,9 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit", "disable_on_error");
+ /* ALTER SUBSCRIPTION <name> SKIP ( */
+ else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
+ COMPLETE_WITH("lsn");
/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "PUBLICATION"))
{
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index e2befaf351..69969a0617 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
bool subdisableonerr; /* True if a worker error should cause the
* subscription to be disabled */
+ XLogRecPtr subskiplsn; /* All changes finished at this LSN are
+ * skipped */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -109,6 +112,8 @@ typedef struct Subscription
bool disableonerr; /* Indicates if the subscription should be
* automatically disabled if a worker error
* occurs */
+ XLogRecPtr skiplsn; /* All changes finished at this LSN are
+ * skipped */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 1617702d9d..6f83a79a96 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3726,7 +3726,8 @@ typedef enum AlterSubscriptionType
ALTER_SUBSCRIPTION_ADD_PUBLICATION,
ALTER_SUBSCRIPTION_DROP_PUBLICATION,
ALTER_SUBSCRIPTION_REFRESH,
- ALTER_SUBSCRIPTION_ENABLED
+ ALTER_SUBSCRIPTION_ENABLED,
+ ALTER_SUBSCRIPTION_SKIP
} AlterSubscriptionType;
typedef struct AlterSubscriptionStmt
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ad8003fae1..7fcfad1591 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -93,11 +93,25 @@ ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2
ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
+-- ok
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/12345
+(1 row)
+
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -129,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +179,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +202,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -215,10 +229,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -233,10 +247,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +284,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +296,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +308,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -309,18 +323,18 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index a7c15b1daf..74c38ead5d 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -72,6 +72,17 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = '');
ALTER SUBSCRIPTION regress_doesnotexist CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
+-- ok
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
+
+\dRs+
+
+-- ok - with lsn = NONE
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
+
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
+
\dRs+
BEGIN;
diff --git a/src/test/subscription/t/029_disable_on_error.pl b/src/test/subscription/t/029_disable_on_error.pl
deleted file mode 100644
index 5eca804446..0000000000
--- a/src/test/subscription/t/029_disable_on_error.pl
+++ /dev/null
@@ -1,94 +0,0 @@
-
-# Copyright (c) 2021-2022, PostgreSQL Global Development Group
-
-# Test of logical replication subscription self-disabling feature.
-use strict;
-use warnings;
-use PostgreSQL::Test::Cluster;
-use PostgreSQL::Test::Utils;
-use Test::More;
-
-# create publisher node
-my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
-$node_publisher->init(allows_streaming => 'logical');
-$node_publisher->start;
-
-# create subscriber node
-my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
-$node_subscriber->init;
-$node_subscriber->start;
-
-# Create identical table on both nodes.
-$node_publisher->safe_psql('postgres', "CREATE TABLE tbl (i INT)");
-$node_subscriber->safe_psql('postgres', "CREATE TABLE tbl (i INT)");
-
-# Insert duplicate values on the publisher.
-$node_publisher->safe_psql('postgres',
- "INSERT INTO tbl (i) VALUES (1), (1), (1)");
-
-# Create an additional unique index on the subscriber.
-$node_subscriber->safe_psql('postgres',
- "CREATE UNIQUE INDEX tbl_unique ON tbl (i)");
-
-# Create a pub/sub to set up logical replication. This tests that the
-# uniqueness violation will cause the subscription to fail during initial
-# synchronization and make it disabled.
-my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
-$node_publisher->safe_psql('postgres',
- "CREATE PUBLICATION pub FOR TABLE tbl");
-$node_subscriber->safe_psql('postgres',
- "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (disable_on_error = true)"
-);
-
-# Initial synchronization failure causes the subscription to be disabled.
-$node_subscriber->poll_query_until('postgres',
- "SELECT subenabled = false FROM pg_catalog.pg_subscription WHERE subname = 'sub'"
-) or die "Timed out while waiting for subscriber to be disabled";
-
-# Drop the unique index on the subscriber which caused the subscription to be
-# disabled.
-$node_subscriber->safe_psql('postgres', "DROP INDEX tbl_unique");
-
-# Re-enable the subscription "sub".
-$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
-
-# Wait for the data to replicate.
-$node_publisher->wait_for_catchup('sub');
-$node_subscriber->poll_query_until('postgres',
- "SELECT COUNT(1) = 0 FROM pg_subscription_rel sr WHERE sr.srsubstate NOT IN ('s', 'r') AND sr.srrelid = 'tbl'::regclass"
-);
-
-# Confirm that we have finished the table sync.
-my $result =
- $node_subscriber->safe_psql('postgres', "SELECT MAX(i), COUNT(*) FROM tbl");
-is($result, qq(1|3), "subscription sub replicated data");
-
-# Delete the data from the subscriber and recreate the unique index.
-$node_subscriber->safe_psql('postgres', "DELETE FROM tbl");
-$node_subscriber->safe_psql('postgres',
- "CREATE UNIQUE INDEX tbl_unique ON tbl (i)");
-
-# Add more non-unique data to the publisher.
-$node_publisher->safe_psql('postgres',
- "INSERT INTO tbl (i) VALUES (3), (3), (3)");
-
-# Apply failure causes the subscription to be disabled.
-$node_subscriber->poll_query_until('postgres',
- "SELECT subenabled = false FROM pg_catalog.pg_subscription WHERE subname = 'sub'"
-) or die "Timed out while waiting for subscription sub to be disabled";
-
-# Drop the unique index on the subscriber and re-enabled the subscription. Then
-# confirm that the previously failing insert was applied OK.
-$node_subscriber->safe_psql('postgres', "DROP INDEX tbl_unique");
-$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
-
-$node_publisher->wait_for_catchup('sub');
-
-$result = $node_subscriber->safe_psql('postgres',
- "SELECT COUNT(*) FROM tbl WHERE i = 3");
-is($result, qq(3), 'check the result of apply');
-
-$node_subscriber->stop;
-$node_publisher->stop;
-
-done_testing();
diff --git a/src/test/subscription/t/029_on_error.pl b/src/test/subscription/t/029_on_error.pl
new file mode 100644
index 0000000000..e8b904b745
--- /dev/null
+++ b/src/test/subscription/t/029_on_error.pl
@@ -0,0 +1,183 @@
+
+# Copyright (c) 2021-2022, PostgreSQL Global Development Group
+
+# Tests for disable_on_error and SKIP transaction features.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $offset = 0;
+
+# Test skipping the transaction. This function must be called after the caller
+# has inserted data that conflicts with the subscriber. The finish LSN of the
+# error transaction, which is passed to ALTER SUBSCRIPTION ... SKIP, is
+# fetched from the server logs. After executing ALTER SUBSCRIPTION ... SKIP, we
+# check if logical replication can continue working by inserting $nonconflict_data
+# on the publisher.
+sub test_skip_lsn
+{
+ my ($node_publisher, $node_subscriber, $nonconflict_data, $expected, $msg)
+ = @_;
+
+ # Wait until a conflict occurs on the subscriber.
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subenabled = FALSE FROM pg_subscription WHERE subname = 'sub'"
+ );
+
+ # Get the finish LSN of the error transaction.
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.tbl" in transaction \d+ finished at ([[:xdigit:]]+\/[[:xdigit:]]+)/
+ or die "could not get error-LSN";
+ my $lsn = $1;
+
+ # Set skip lsn.
+ $node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION sub SKIP (lsn = '$lsn')");
+
+ # Re-enable the subscription.
+ $node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+ # Wait for the failed transaction to be skipped
+ $node_subscriber->poll_query_until('postgres',
+ "SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'"
+ );
+
+ # Check the log to ensure that the transaction is skipped, and advance the
+ # offset of the log file for the next test.
+ $offset = $node_subscriber->wait_for_log(
+ qr/LOG: done skipping logical replication transaction finished at $lsn/,
+ $offset);
+
+ # Insert non-conflict data
+ $node_publisher->safe_psql('postgres',
+ "INSERT INTO tbl VALUES $nonconflict_data");
+
+ $node_publisher->wait_for_catchup('sub');
+
+ # Check replicated data
+ my $res =
+ $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM tbl");
+ is($res, $expected, $msg);
+}
+
+# Create publisher node. Set a low value of logical_decoding_work_mem to test
+# streaming cases.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf(
+ 'postgresql.conf',
+ qq[
+logical_decoding_work_mem = 64kB
+max_prepared_transactions = 10
+]);
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf(
+ 'postgresql.conf',
+ qq[
+max_prepared_transactions = 10
+]);
+$node_subscriber->start;
+
+# Initial table setup on both publisher and subscriber. On the subscriber, we
+# create the same tables but with a primary key. Also, insert some data that
+# will conflict with the data replicated from publisher later.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+CREATE TABLE tbl (i INT, t TEXT);
+INSERT INTO tbl VALUES (1, NULL);
+]);
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE TABLE tbl (i INT PRIMARY KEY, t TEXT);
+INSERT INTO tbl VALUES (1, NULL);
+]);
+
+# Create a pub/sub to set up logical replication. This tests that the
+# uniqueness violation will cause the subscription to fail during initial
+# synchronization and make it disabled.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub FOR TABLE tbl");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (disable_on_error = true, streaming = on, two_phase = on)"
+);
+
+# Initial synchronization failure causes the subscription to be disabled.
+$node_subscriber->poll_query_until('postgres',
+ "SELECT subenabled = false FROM pg_catalog.pg_subscription WHERE subname = 'sub'"
+) or die "Timed out while waiting for subscriber to be disabled";
+
+# Truncate the table on the subscriber which caused the subscription to be
+# disabled.
+$node_subscriber->safe_psql('postgres', "TRUNCATE tbl");
+
+# Re-enable the subscription "sub".
+$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");
+
+# Wait for the data to replicate.
+$node_publisher->wait_for_catchup('sub');
+$node_subscriber->poll_query_until('postgres',
+ "SELECT COUNT(1) = 0 FROM pg_subscription_rel sr WHERE sr.srsubstate NOT IN ('s', 'r') AND sr.srrelid = 'tbl'::regclass"
+);
+
+# Confirm that we have finished the table sync.
+my $result =
+ $node_subscriber->safe_psql('postgres', "SELECT COUNT(*) FROM tbl");
+is($result, qq(1), "subscription sub replicated data");
+
+# Insert data to tbl, raising an error on the subscriber due to violation
+# of the unique constraint on tbl. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO tbl VALUES (1, NULL);
+COMMIT;
+]);
+test_skip_lsn($node_publisher, $node_subscriber,
+ "(2, NULL)", "2", "test skipping transaction");
+
+# Test for PREPARE and COMMIT PREPARED. Insert the same data to tbl and
+# PREPARE the transaction, raising an error. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO tbl VALUES (1, NULL);
+PREPARE TRANSACTION 'gtx';
+COMMIT PREPARED 'gtx';
+]);
+test_skip_lsn($node_publisher, $node_subscriber,
+ "(3, NULL)", "3", "test skipping prepare and commit prepared ");
+
+# Test for STREAM COMMIT. Insert enough rows to tbl to exceed the 64kB
+# limit, also raising an error on the subscriber during applying spooled
+# changes for the same reason. Then skip the transaction.
+$node_publisher->safe_psql(
+ 'postgres',
+ qq[
+BEGIN;
+INSERT INTO tbl SELECT i, md5(i::text) FROM generate_series(1, 10000) s(i);
+COMMIT;
+]);
+test_skip_lsn($node_publisher, $node_subscriber, "(4, md5(4::text))",
+ "4", "test skipping stream-commit");
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT COUNT(*) FROM pg_prepared_xacts");
+is($result, "0",
+ "check all prepared transactions are resolved on the subscriber");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.28.0.windows.1
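The LSN-extraction step that test_skip_lsn performs in the patch above can be sketched outside Perl too. Below is a minimal, hedged Python equivalent of the same regex match; the sample log line is hypothetical, modeled on the CONTEXT message format that appears later in this thread:

```python
import re

# Regex mirroring the one test_skip_lsn uses to pull the finish LSN out of
# the subscriber's log: the errcontext ends with "finished at <hex>/<hex>".
LSN_RE = re.compile(r'in transaction \d+ finished at ([0-9A-Fa-f]+/[0-9A-Fa-f]+)')

def extract_finish_lsn(log_text):
    """Return the finish LSN ('X/X' form) of the failed transaction, or None."""
    m = LSN_RE.search(log_text)
    return m.group(1) if m else None

# Hypothetical log line, modeled on the errcontext shown in this thread.
sample = ('CONTEXT: processing remote data for replication origin "pg_16391" '
          'during "INSERT" for replication target relation "public.tbl" '
          'in transaction 725 finished at 0/1D30788')
print(extract_finish_lsn(sample))  # 0/1D30788
```

The extracted value is what the test then feeds to ALTER SUBSCRIPTION ... SKIP (lsn = ...).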
On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
I have fixed all the above comments as per your suggestion in the
attached. Do let me know if something is missed?
Looks good to me.
src/test/subscription/t/029_disable_on_error.pl | 94 ----------
src/test/subscription/t/029_on_error.pl | 183 +++++++++++++++++++
It seems you are removing a test for 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33.
You should also rename 029_on_error.pl to something else such as 030_skip_lsn.pl or
a generic name 030_skip_option.pl.
As explained in my previous email, I don't think any change is
required for this comment but do let me know if you still think so?
Oh, sorry about the noise. I saw mixed tests between the 2 new features and I
was confused if it was intentional or not.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, Mar 21, 2022 at 5:51 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
I have fixed all the above comments as per your suggestion in the
attached. Do let me know if something is missed?
Looks good to me.
This patch is committed
(https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=208c5d65bbd60e33e272964578cb74182ac726a8).
Today, I have marked the corresponding entry in CF as committed.
--
With Regards,
Amit Kapila.
On Tue, Mar 29, 2022 at 10:43:00AM +0530, Amit Kapila wrote:
On Mon, Mar 21, 2022 at 5:51 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
I have fixed all the above comments as per your suggestion in the
attached. Do let me know if something is missed?
Looks good to me.
This patch is committed
(https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=208c5d65bbd60e33e272964578cb74182ac726a8).
src/test/subscription/t/029_on_error.pl has been failing reliably on the five
AIX buildfarm members:
# poll_query_until timed out executing this query:
# SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
timed out waiting for match: (?^:LOG: done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.
I've posted five sets of logs (2.7 MiB compressed) here:
https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing
The members have not actually uploaded these failures, due to an OOM in the
Perl process driving the buildfarm script. I think the OOM is due to a need
for excess RAM to capture 029_on_error_subscriber.log, which is 27MB here. I
will move the members to 64-bit Perl. (AIX 32-bit binaries OOM easily:
https://www.postgresql.org/docs/devel/installation-platform-notes.html#INSTALLATION-NOTES-AIX.)
On Fri, Apr 1, 2022 at 4:44 PM Noah Misch <noah@leadboat.com> wrote:
On Tue, Mar 29, 2022 at 10:43:00AM +0530, Amit Kapila wrote:
On Mon, Mar 21, 2022 at 5:51 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
I have fixed all the above comments as per your suggestion in the
attached. Do let me know if something is missed?
Looks good to me.
This patch is committed
(https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=208c5d65bbd60e33e272964578cb74182ac726a8).
src/test/subscription/t/029_on_error.pl has been failing reliably on the five
AIX buildfarm members:
# poll_query_until timed out executing this query:
# SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
timed out waiting for match: (?^:LOG: done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.
I've posted five sets of logs (2.7 MiB compressed) here:
https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing
Thank you for the report. I'm investigating this issue.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Fri, Apr 1, 2022 at 5:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Apr 1, 2022 at 4:44 PM Noah Misch <noah@leadboat.com> wrote:
On Tue, Mar 29, 2022 at 10:43:00AM +0530, Amit Kapila wrote:
On Mon, Mar 21, 2022 at 5:51 PM Euler Taveira <euler@eulerto.com> wrote:
On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
I have fixed all the above comments as per your suggestion in the
attached. Do let me know if something is missed?
Looks good to me.
This patch is committed
(https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=208c5d65bbd60e33e272964578cb74182ac726a8).
src/test/subscription/t/029_on_error.pl has been failing reliably on the five
AIX buildfarm members:
# poll_query_until timed out executing this query:
# SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
timed out waiting for match: (?^:LOG: done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.
I've posted five sets of logs (2.7 MiB compressed) here:
https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing
Thank you for the report. I'm investigating this issue.
Looking at the subscriber logs, it successfully fetched the correct
error-LSN from the server logs and set it to ALTER SUBSCRIPTION …
SKIP:
2022-03-30 09:48:36.617 UTC [17039636:4] CONTEXT: processing remote
data for replication origin "pg_16391" during "INSERT" for replication
target relation "public.tbl" in transaction 725 finished at 0/1D30788
2022-03-30 09:48:36.617 UTC [17039636:5] LOG: logical replication
subscription "sub" has been disabled due to an error
:
2022-03-30 09:48:36.670 UTC [17039640:1] [unknown] LOG: connection
received: host=[local]
2022-03-30 09:48:36.672 UTC [17039640:2] [unknown] LOG: connection
authorized: user=nm database=postgres application_name=029_on_error.pl
2022-03-30 09:48:36.675 UTC [17039640:3] 029_on_error.pl LOG:
statement: ALTER SUBSCRIPTION sub SKIP (lsn = '0/1D30788')
2022-03-30 09:48:36.676 UTC [17039640:4] 029_on_error.pl LOG:
disconnection: session time: 0:00:00.006 user=nm database=postgres
host=[local]
:
2022-03-30 09:48:36.762 UTC [28246036:2] ERROR: duplicate key value
violates unique constraint "tbl_pkey"
2022-03-30 09:48:36.762 UTC [28246036:3] DETAIL: Key (i)=(1) already exists.
2022-03-30 09:48:36.762 UTC [28246036:4] CONTEXT: processing remote
data for replication origin "pg_16391" during "INSERT" for replication
target relation "public.tbl" in transaction 725 finished at 0/1D30788
However, the worker could not start skipping changes of the error
transaction for some reason. Given that "SELECT subskiplsn = '0/0'
FROM pg_subscription WHERE subname = 'sub'" didn't return true, some
value was set to subskiplsn even after the unique key error.
So I'm guessing that the apply worker could not get the updated value
of the subskiplsn or its MySubscription->skiplsn could not match with
the transaction's finish LSN. Also, given that the test is failing on
all AIX buildfarm members, there might be something specific to AIX.
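The equality check being discussed here can be illustrated with a small sketch: pg_lsn values print as two hex halves (`%X/%X`), and skipping starts only when the stored skip LSN is valid (not `0/0`) and equal to the transaction's finish LSN. A hedged Python model of that comparison (an illustration of the condition in maybe_start_skipping_changes, not the actual C code):

```python
def parse_lsn(text):
    """Convert a textual LSN like '0/1D30788' into a 64-bit integer."""
    hi, lo = text.split('/')
    return (int(hi, 16) << 32) | int(lo, 16)

INVALID_LSN = 0  # '0/0', i.e. XLogRecPtrIsInvalid

def should_start_skipping(skiplsn_text, finish_lsn_text):
    """Skip only when subskiplsn is set and matches the finish LSN."""
    skiplsn = parse_lsn(skiplsn_text)
    finish = parse_lsn(finish_lsn_text)
    return skiplsn != INVALID_LSN and skiplsn == finish

print(should_start_skipping('0/1D30788', '0/1D30788'))  # True
print(should_start_skipping('0/0', '0/1D30788'))        # False
```

Under this model, any mismatch between the stored subskiplsn and the transaction's finish LSN, however it arises, leaves the worker refusing to skip, which matches the observed symptom.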
Noah, to investigate this issue further, is it possible for you to
apply the attached patch and run the 029_on_error.pl test? The patch
adds some logs to get additional information.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Attachments:
add_logs.patch (application/octet-stream)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index f3868b3e1f..f7f77071c5 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,13 @@ maybe_start_skipping_changes(XLogRecPtr finish_lsn)
*/
if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn) ||
MySubscription->skiplsn != finish_lsn))
+ {
+ ereport(LOG,
+ (errmsg("not started skipping changes: my_skiplsn %X/%X finish_lsn %X/%X",
+ LSN_FORMAT_ARGS(MySubscription->skiplsn),
+ LSN_FORMAT_ARGS(finish_lsn))));
return;
+ }
/* Start skipping all changes of this transaction */
skip_xact_finish_lsn = finish_lsn;
@@ -3969,6 +3975,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn)
subform = (Form_pg_subscription) GETSTRUCT(tup);
+ ereport(LOG,
+ (errmsg("clear subskiplsn %X/%X mysubskiplsn %X/%X finish_lsn %X/%X",
+ LSN_FORMAT_ARGS(subform->subskiplsn),
+ LSN_FORMAT_ARGS(myskiplsn),
+ LSN_FORMAT_ARGS(finish_lsn))));
+
/*
* Clear the subskiplsn. If the user has already changed subskiplsn before
* clearing it we don't update the catalog and the replication origin
On Fri, Apr 01, 2022 at 09:25:52PM +0900, Masahiko Sawada wrote:
On Fri, Apr 1, 2022 at 4:44 PM Noah Misch <noah@leadboat.com> wrote:
src/test/subscription/t/029_on_error.pl has been failing reliably on the five
AIX buildfarm members:
# poll_query_until timed out executing this query:
# SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
timed out waiting for match: (?^:LOG: done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.
I've posted five sets of logs (2.7 MiB compressed) here:
https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing
Given that "SELECT subskiplsn = '0/0'
FROM pg_subscription WHERE subname = 'sub'" didn't return true, some
value was set to subskiplsn even after the unique key error.
So I'm guessing that the apply worker could not get the updated value
of the subskiplsn or its MySubscription->skiplsn could not match with
the transaction's finish LSN. Also, given that the test is failing on
all AIX buildfarm members, there might be something specific to AIX.
Noah, to investigate this issue further, is it possible for you to
apply the attached patch and run the 029_on_error.pl test? The patch
adds some logs to get additional information.
Logs attached. I ran this outside the buildfarm script environment. Most
notably, I didn't override PG_TEST_TIMEOUT_DEFAULT like my buildfarm
configuration does, so the total log size is smaller.
Attachments:
log-subscription-20220401.tar.xz (application/octet-stream)
%A�����q�q#����������H��.�R ��W}�����s������������K���7�y�pE��`�
���kT, =y������YD�x*�.&p��I���dP9��������eY��/�@XH�������� �[���������h���p%�P4����2�2�M��K)pz�f}�TX��[���e���b�����H�D�i����9!�?&��%rZ�MR7n��+!L��V�I4^�����@#�;�-���2�q�I]V���gF`��Q=W9��C[��g�fe������K�$ 9��3=*�K�?�d��&l,N�5�
u��~�1�>������|�*d5����c�J2Y�0^v_�O�U����]!f{�:��=�oq��pJ!&r�9'8B5r�{<}��a��y~p�(�Y2'YesA�n�`�M�nHKJ]���6��;�8���p�����e����}�)�i�t���s7(����-B9�T �f��D�V�#�"Fr���������`�R����XT�1�&U�6�?l=�����!3I�M�B���f� �(��}Q���aE�s�����J@�-"���p�Or�S3|C���m7Uq�J&@;����a�8�s+���s�:�g��'TvoL�qv�aJ����K)����������j�[q�i�mi�Q�,�����x�M������ �Q��3%]�����
���8��B��#x�UH��G���oK�@`�4��sj�� J�
�ZmmXw�qJ�.V@KB����i�~�3���@D-��u g|�T�a�/�8[�-V��F��[���kd2�����0�q�����^��G���k��I����w>#��t6�E���!���~`��'����{1k�8�6�*��e��n������n��P�|���[��!���Z�kac0�Z|��>��W�����������
����|U%���������D
j�����|�gk����L������q�#c�U!n
�b���:�W �s�[ ��Qy-!�Sb�mF���C�'
�3�������_�+����E'�Q�;<���_�t���D�*U���jp���W���P?�.�d(�v3�&�6'�WY���A���"&�-p���}V;��k�~�f��IL������NL���I�N�k�$o��Fp~"�!t���[��=��xO?VX*��=������m�����Q��F���+����)1���
�~3)�E�����WG�7v���@b�z
mt�]�N�1�Z�����Kf�;P�h�o� ��UEN�����b��i���Gq�*���
�5�'Sw�`�$���day�f!�a!0Q���=T~.��j�������x�B��83��:���|[�����k�po+i�fQ���*���h��T$�u��k��Q2?p������ A�c���;X�#6�����#�b��8\^l�.D[�Cba��LEp����h��U!�����{�B���*~c������E�����m���(?e��%$_�eo�'(Kzs��4�QdC�
�9��X6���i��ny�'�0���L�J�� �������F�gmP}smxS�I���n%M�����K�aT%#L"mQ���u���}U��Zv�u�E�J�W�
.<f�3)@��\KI�_���/��@bY�X`��T��,�[�DKDT���`xUI
|%d�Y�*������5y�d��x�u�<x��/l�#x�^�@q>5�~�Kq9�\���.�1���y��P��[��m��F������td�WM$Z���n�g�Vn�%��D���y\�2@��Db�����=
i@�#��sqV��3E�K���?�� )�U��n�� � ��|<����;^�=���)TN�^mN�����/���< ���s���p�cGgoM�>��&wX�4J����22S\�%"�eD��F�h�l��W[�]Tu[VkL@sf��ws�1��z%�9������b�@��}��������9� Uv8|P�8�p�0��g F�JQi[����5�&"���|0B����0��������i'�����V*��PQ��D��0�X�����Um��v�Wxn���qV{B����A�����{Q'�/QU3JZ�?�~�@'����_ �:�NH{^��'�`h�~���N$0RhA�H�yj��}�c��~ ��2�4a''�����'*H<��Z��MZ��Y�(�M�S�?��~���z\��)
����t�����?F��-������h��O�z.Z��!��e�!+R����� ���>���G������L�D!��Q���f��1��z��o���I���%U��J��j�S�����)�_���D�1���NQ6���a��I"�������*�@
G�L��eS���'�I ���_����(!�K�B�U/O�l(��j�����7��k!�V��( =����;��pUQb��$H\���/�fc ��X 4!Pu�!������:���C��{��7�-�{/��H�?�D�Dg��j8@�Q��dgN*��z�;�999�W#��qam������"H����d n���$x� ([o���="� �E��8�����O��R�(7�t�z�%�����_��r�,/d�h g�N����#"r] �s�hHL��}#2����;i{
9�= WM������]���:Q^�I"��/2�O�R�:�6����D���F��j1�5na�z��DDu|���U�#�xe�/�!�)u��|�R^��Ef��ye���|��54P�� [���:�|���e����_j����3 x�g������QQM�g^N!��22����I� -bs�/����Y��mBl��1U��^�~����1m5��QqW�R����rh�Z�6����b�H����mO��CT���.���_n��b��{DG�� �8��d?�\�mqv��}�#����p����!8u^'|\��� *�O��l�v$33��;G����0!�5jD�8+�g�0��������P6�����`�C�4)�R��,�J�{?geaUN���|�����IJ�xB��>������ ���@�i\M�
�Y���kV���
���
'o��>�)��G�|���$��"��N�����P��'B��(����������-����0�D���!�x�:�Z�7��=����9r���'�����y%�=!sH=�I����G����fQ���Uc���}�bA0��5R:��c��aT���`���n.�}u<��5}��Uk����T&JtG�x������6�
���h8|��v^*�����(~��3�vZ�sJ,�}�,���HC�`�V`�n������m�LhB�&������9���E|/B����t�9M��������~G<"�Z�������VIx�e�*B���a��QM��-�F7���=>�qq�b������A����f���B��]�L;*~�����o�7���O3b�sr5�d39���<Vg@�m�&�|K�$z����O����d1m��V��+fM��i�
YTz��������C����SO`
� �n�wm�+����������;Mf�U��B!�v-1�T�.�.��,Y�KgN�--���O*:*��y>u�� ��p5���@��`��Pi� ����6��d��������Np�����6!&^������pB(K���MU��jAd���f i���Q�M���b�S!BDH�p�I]���oa3b���u��A���]0�)xa��~�����#1Mn�����qq���-�?A���)�Q�����C��a�F��� ���Wa��mz���v&�
�tw��H�k��.��Q�0��:�����c%���I��Z��s��w����Y��������G�T.�������x����.&�W�Z�X��G���Fq���F�������H@
���O�OD?���P�}��}��)�Z;�}�OYQ��l���RE�^[����7TR�`����{� ��@������C_.����
�J������s�� �j H ~e@��uy��y��"��[S����B^�!��s�D}��$/]�:��%��S������p�q\T�c�`0R�����gnbd~��
�qNx��8?�2Iq�U% ���������+;m��g(<����x�x4��H�f��uK*NO}��q���k���+?�W�lL]b���N��l��*��(�fJJ�A�1�ES�����C�"9_KH����!���%����Y%��S�����|��r��|2�/8"����R���
\��j�V�/VW�Qz�wLd}CX'��?h��S��;�\��S9h�$��N��:��ij�=����zk��Z�q2u��M�XDb�\����L5:��wT�g�����t�@����m����:�������<�W2�bBW����^K��+������`w��.��u�d-��y��hg`��;x�i��$|�����|���9?|F��{q�1!ht���]w��D%��>�;�i<�r��C����������*���(�
�*�v�W����J!��VCZda8��3��V:�s�9�]����JW��;�&�|���7PZ������le/J0�� �u��SiwM��K�������,��Qy������� ����gN"�����~�v�\�]P��n(��P6'l��`U3G����A"?���w���el>���=Gl��B^��,���ee��|�������;��������L�I�fYO��.uR����D��DlR��H.t�
q��������\������5�J���-�#�T��x�K�f.����.�1���N���9���g�����[I����E����A�>XrF�$����t?ImIK�cq�P �~�E��B�w�^�XqH$��0�����7@�uGb9g�� *�������d%�1���� ��p�����&�<,_C!"��b�8��EmV���(NC���h������B�(��j�*�.���e���[8
nb%�������[w ���9�N��D?ATTR������B��)R���;%2�n+���Sww����aR�����������L�[���?�w���$D��^��b���7h-�|�=���\�_�V.��tR��Md��2���e��q������B�E��\��q��E��Q�O�B�9�*��@������s���'������(z@f�}rM��6-o1�B���-��g �����k���#�����#�gP���sfi�[6���
9��e�a���w��kg���Z�x*�t�e���N��7�mM��Stw~w���2���������wv�����"#7�+m&�x�����{�VbK�;���"�N��H�H$�,�F�!-���~��;`}U����+.#��n`��?W��
4w�:����)B����Ls��X��������C�Z������L�����E�G�O� i��[�������-S��Y ���.�� ��S%��I �@?���F��G���3��O\��YMD
gG������m5�"�E&�)����������P���trB�Z�b�����&�0����>(���������-[�I��"��,��85U�-Vn���D<<���D��I,
�vI�CQ�@�N�G�%~�\�_=Y�_��rRX-4C_qa0�0�hr���"5o���tk=��7�=�Cl3E����G.R����z2.8�ur�� c����sk� ��a���M��]e��(�u��
�����:/�n,�I��b� ������/��l�L�G��
�#`K��������b7o�Lt���d{B+�z{D�����&�����(E�>{�����V��0����y�NH���lA���+���K�9�����\� ��@~T��4��w�p������ "��Zr�m�������|&;lH��9p�Q��o�!�=@7����r�B#������z��E!?)�����r���j,�K�5W�U\�IpDt
��Z|J4oj�����������DS�h w�@�qfwg��p�"d�D����G�8�/Q��@W�� �k�"J���-/9D3m���������:�����R N�K���k8lo������K�b�zQ�)uk����^���� ������}U`���G��\A����<����)����)UX�H��}*���v��HD!\~�L�_,�2#��z��P7�Q0���a�Z� �����v�&�U��%1+��[�W
~����`�D���] P9�R\���
u����Mj���]���6��3����GZ��6<9��A5�!�Lw�������1W��Y�
I �'��^��7����g���q25��?����@��p�
���)����A}C��n����hn�m�\~� �k��W
���bi]I�:��v�� O�����i�@�T�V*#��j���>��H���|D�D��jc�LC�v���w7���+>g:fA�7�-���N�n����4]���TTX�T�v����<-��Z�\�|��{m���".bB���xDQ���[������y�*{�$�i~W*1��<�3��~g����IS��;�}���Z�'��'Y��`S1���e�6����dRO���=�Fco@���x���>����WAc&�7G��y����o��n �W���8&�?�� ���YT)����7�yA�����N�
�Q������/go�[08�j�%@��X
��������?*��!�DXjY��<����Y�Vc��<�]qF"m^+��/u�p�3�������g������`d���\����++��Y�z"�I_��RW��r����AQ���b��t�<U���~�
P��D�V�$�j5C��Y�e��a�X��G����5���^�)j��.���"T�h����qh�^,����6??��
@H� ��c������?�4k�u�K��tq���K��5<�v��9����v� �P*����%������0���{���S����+�k�IU(Vr3e�����U�3i�����*F_G�3,��_�\�n�������\��}
�g��)BX�����I���}��%�/&���/���"��V����0|N�XE}��AFAc�"b�����0��8�I��Do������+������f���M����j�sA*K�/&�X���S��;Z�x�����<��������|~���B!����X���hx��my�OS�.T��B����q���"3�ooOicZ���/[aL�N��- �������ni�D
C�n�s@��2dA�)�teL\�A�F.R�m��r-
�$-��{-�-�,��� ������7�_^��OF��S�g����#N�30oY��L����'�o�J��
>M�a�?7���We���,��R ���J/[�'>|����-���9���8��X�l����@�[$vK
���2;KI�u
c�d�>��CC�{�(R=���6��
`��"��M���uf<�!���u�������
V��6�\xT���+�b��J�Z��;��.������s��y�[������m�ZU�[WYOMhN[�9�Y������2��F��'���7`�d��bU�_�� �!$�xnP����� %���k���uzW5��4ON�J0�K;Wa�=� �W�1&hk�Rl�Vt��8C%��Fy �]��Q� �e|I�Ki-�H�p%��$)Gs\�_�"&����%�@+Q#�5�f�����-�p�e���by�� �H
l`C����M�N���r �%F�B^��Fk�{��]�Ht�Q�=c��O,��t�*`�;���g��v��r�u�dU���x.��+C#�N9�������P<J7���v��������R����YU ��!���h��k�V�-�\���y��*��OG�$�wc�y�� ��x W)O2��W���_�V/O�R>�|4�%� 0,�W��(y��h��?��/5~�'���yQ��s�TvI`�E��EQzH�=����{����H���`n�r�X~����(8��7�U���^�U��J.���i�[��C��I�\�ZR������8�S��u�X�s!���!���Q��0�\@�|tR���^���G��R��>�i_wuOm��
'��>!�#t)�,�%W��[��m�DD��|<���&��z���B��n��x�YY�X������.���#�;�������m<����~J&�B� ��EZ��1���
��`��Y&�0$p�S��������q�?��;}���HN"8�=��|+�p�.�v����F�7�`��6��7�l�����W�������:QH����)����a�j_�6�Y��dJ����L�:���;��.G�T�����j���=N��z��(���~B���Bu���:�.�k[�A��:}F?s�^TA��TO8�+bm�B!+E�l������ga1���-������x��>R���Gt ��S�����H�\]���F"���_�|>��K�����{�\�����V��6o�Y����fa���F����`��L�������x�����'G�(k���1���i����O�;�C��f�3�����4�B�`Jco�K!��)8���E|F�>{%��Lh�A���9p~r��� �������,�U��J�?p�@���r����E�]o��� �=�nbL�uV�A�&��g"���9��sM+.\���=��i�B��O%o���D�U�"1?��[����^��#j o�*���]+Ck�q����Q"*��;�<=������<��va?�&���p{I\-�-"��&Yd��G�e%�}���8n��1��*�����3T�+�\�'��
������^�s�*�������}�j�������(��H�F$�%]�w��Gf������q���Z��F���[��A�|l�r����lo,T��]�?r��k�S�XA��.�f���`�(8�9��|�4��N~����X�[� �d7wOHT��Y�.i�EJ����No>�T��M����O��A�d��B��6I6��=��)�����D���7�� G����Vd�?T����l��b��P=��M��z��c;��
������"UYT6�$/.��.��KD��gA\�k@�q$�A���Yl��Q����,j���<-���(i�P��f�<6��L�=�G�:����V� �4+A>z��3��7B.0>�m��1#j�4]+�0{!R�
�|����f�f���^��kN��@����=����[6�cq[$&p`s�|� O���&k����K�@:������G!z����1�Zx���i_ N|yP
] M���%>�3mo��.I���h�Q�\�����i���(^?{%�������8�yN[��t(>�T�)�7(����*���#����f��#��q_���-|��2��W��
�&%X t?��������Og��N�~��v T��#�F�9���[`,7�U��U��Y���3H12�����g�^����~!�����M�;����vv�n�k���SIv�U��y�lh��ZW�h
\�c������U�����=�
��WN$t����h�������Gb���xG@��F��]���8�BR��
d�k����M��Lm�4
<������*����Pt� � �^�$\�v��7����l�z:����P�Jk��B��u��~� �7��~PY"]d��vTT�WP��B h4���D�l��0�FhyQ-�V��0���'�bb�@�H����+k�B���C�����LO]]DW��T���5V��Md�uT%T�y��� .�g�8����%����Ol���'�J"��d��=o�1�W��a+�����1���)]:,I52��=�H��?c�$�G����l��+K����W��h/~i�0�D�E�����$�o0��A���U#,��p6rB�������6����C��p����)6����;"�q �@p�\���F�m7��� �RX2��,���d�����
K��eHwh��>��k�WC���C��q��9JAs-��o��Ta!UDz�$R0s�A��_.25���&�[f�F��h|�E�Dwf�I��������f�?�=�TI'�[����m;��u��U�I���R��](F�
"rH���Q%�T���
�ZV����4t���7'8���ga�����/�w�V�}�h�����R Q�5���s���9��P�a>�$N�����(;7���3�j.hBJb���d��sw����]���=���y�]'g���M����,J/��z���q` ���2wH��#?|e9����Q����V�{D)
�P($�(������~7�8k�X����]1����2�{�qn4���o�H�Kp���� ?���1T�2n��/"���J�f�f������l�vf5'J_�~z�Z�:q��]{�ZjO��e��������f�/FX�����������;����s � ��a�V(~����� �H�7��j����|*j�\�n���Qg�2^��a85
l~H0�]7�j���R}��S��F`���"@<S8��B~��]1{�S�3��]�v�a��
m����,�p���vm�� T����;.?��m=����b���.dGT+\�?��P�%
��Kf>�>�f��YXMV��'��e
�5��"���a�gKS���H6�����Z��������=�A�9��V����;�+�����C��N��Z��RV���H-� =Y��t����$
v��
t�D3�����{"6�:
^9��P��+���N[K�1z9_�'f�t�8��F�Fu8�2%zW#C+�����������S�����y\�Lb�3{T�w������s�%4���j�Gu����u,� X�isx���� h��B-$p�{����]���zA�-��g�T��'�$I|�H:������2e>��: ��~���#I����u�;����U�U�p8?�����}I@�)
uZ��r��0�0_J'��%]���rL`�G����d8&���<H4L����&Q��J��8Gc�TCf�� `��0%�!'�q�M� ��5�����;i���)%O��D���foU\X
�(���%2�=�K�)O0(x<�}$��%O�w�!��~;�$�3��~�V zN+��H ,��$=pK��8��E=C��0M>��]Z������C��S����e�F� ?6D!����������D���V8�I Zg���}�1F�^N����+2'���k�����[�_J�>�|@g���a�+4�~��bo��������e%n����!_7Lh��� � �_�l� T���j��@�M�6�����y�w%UP]������������In�g�!�q�F4Yv����Q����j��f\,��5���,�f��tu;���#;@�?{o_�d��[]� Q�P�u��S�Z}P�l���V�l�A5��m<�&����������e�����2N�&-X����P(nL�D(N/ X!�w��
�P��;Y��FJ������ ��f��H��SW����#������2�S�>�����������L%��8����o({�
�� �!P���M��k�����e)W��!p��,����M�;e�-1��~��s��������\��)�����Cf��O�����/�6tK��.�����=?����Yod�B�����D&�
����� ����$���Y��[��/�M��
_���|RK�]R=�\Y2��g�3s���x/���� �*}�?��lTFc���TO���P�<�����Y�2#B�q0�0Z�I��e�@���XxL@��M�W>n�Ez�����WL���`Qg������Y��BT78��Q��_�>�|���4��C����mq��Lf�:
�7A�$='p�L�� +D30J&��u�) a����� �p����tVS��
7a��}���a�C�-gj�<����HW�pF1^�&�{�I+���C-�y�\*��JT �yT���ta�8-l=oH��#7*=�ig���� ����m��Ki���_�j=c����%%�m#��t�H���6�9?���2�-*3�*5�N���!;0������G���U@\3���Y�?x�.i�IDk�P����h�nWM�
���h�7�����V���� k���S
F0n�Q������Q$�M��i��MoO������
��GsA�����?|�L�k�������hpr2�h�h����x����(�������R�_�p����#�I8�D��������J��A?�6`��c[^s��9Xk�6���Q[J�}�J�.p����Jp�IK���c��� ���ra BOj�AO���7�� B ���^
(h�^��"�o�F�j���$,�0O��qbv@�D�?�)������{,(o��n�DY��]B������i����X�o�����Jj6�ddd��o������w�G�^����3���V���sHi �uP�Q�p����"�k�+/��6o Z�� q��TM��� �3#�k�5��v���H����y;��B�����l�����`{wKY�z��R�;���F�y|B�[�.2���y9��������V�������y��c}��r������J�����=���e�"���L�� p��0u��8���� P=��{��������q�E�������2�G��W���k�H) ���z/��We�>T����F�X�+�tBY,1�+������!��K�7��t����?��(RN�����m�����>���j�-�;B0}�������*����,- e��H�Bm�4������.� ��_�~r�����n��IHbYo�hm��gS���LO�R?YR�L-����!i�`,b��0`��4M}�bwX�n���S���_��-zY�B�=cRN�u�m��L�5���+�Q�I��^�.�����,fH�����$S����'l���g�Ph@>�z��tD����������G�R{�@���U��d��;�Ddx?p��/��?:/��J��Y�z+����\���>��<��g �)D��#�e������UL �^
2�p=���q>D������j������u�h�"Y���L}��Jo� ���ma�?@�7�V��@��WQ���:�N� �,$��1h>Q?����W�h=��e�������@�s��s G����@�K^^f��o������%�*x��|�zq�X�-4�������������k��>gl�������*L��J�~* �m�~�_�mU �
��XE��
���J�g
K��&�D�p�c�Buqv�w��8���<b�|@�M�a�����b�B���n��\���k_2���{�v7����M63���+n3������e�f9���c��`�[�Y�!:��c���E���U/��X�����*Jn�!�����AY|q{�O���n�Gk�=�e������ >-���Rj�YH������9�g�"yY�i=��b�^�\<M�s16��DS�(��
�c{kb�H�e��%P'r$����sIP�����w~>�wQ�1��)�
p�hx���� �>�)��i��gCy�z��U�E��|����Pt<�#��Ouq�9!D`G���=�'7WG� ��iS�M���A���?��t�d��q�J��#��Q�7��P�5.�&1�����&�� �p����Y�)V�%���w�~WY�FxV��
>���������������8Pgk����}�S4��� SEoLCq����S�HA�����K����e\���
��1����W�}���g�����������<��:32�Zx��bi��l,�'��8^Z����SqA���}�������6��$��I��>RLg"�
������9
��ebo�HE��j�T`oD57��M��c�O�Mk�e�l�������� lW�*���#��=��!�o��1
�]\�(+��{�"���@�+�~1?SO�]+z=Az��t�y������y�����fF����^��!� !����������`���q������b�l��!x��'M�q���/ f����
�{�RN�f��* �n��%�k����_�PQ�*�Y���a��Kb�/�t���T����a`5��d�6D�����DD�������6)z��F���lS�l��!��Nw���Z�g�E(�?D��{ ?�L�,��Z����^$^�P������&&���2�������M�[=iA1�tS�Qo��#n�?�U���3����/�����+�+.;]�����z`��!S�Nj���q���Zi�?���E�RT������t�
��%A�I��
Ba~%�3�J�����
�t�X`�;3����bo�!�'���"���0HD�r>��f��bD.?��&��.�o=<��>?F�o�C���+��"{F��v���64y���0sE�$`I�w]�:'��������J���D�>��������)����� ���u7��<�o@�G+�1��M����e����zr�\q�+����2�L*������g�_����COj�;k��6A,�*�f�Ny������W��.=�W��b�S.��i��}��K=��Dp�^��
Ao��E&���`����i{~v����J�8�bv.C:t��X �H��#7kUS����I<��%F�f,�<�],����H��T�w��������o����o���C���w��,
�c���b����5�O����wD�}��G���8���T�K;0^?<$_�x \.G6���Pq�/P3�l����of��r�y/^���v�|*�q������Z]�/}�L�
�s���HI���J���n!H@ �������"x �J�c�;gu�
�6�C+K�
p�RO��hPfX�� O(�9����t������eC����^���T>�/s2P�a��D��� t����=W�X>D7��U��lu&XDI���U
����o -=�>����K��>��?�@}�Qj9� �K��k=I^�1F���D-/
�D���dW�>��X(�F���-8:���hw�W��������bo�����-jp#o;=��VB�r�Q������j�$�����h��e;�q��-OA� ��xKR������PS`f���/[�mB��,@e�����0�&�2�VA6e��#+/'����`�("�x������Hwh��7�F
`�f:��r�z�w�����T�D�����(F���Ps�B�EN�L*����q<md&�Uf���*0���~�������9��������84?
Q�����~%�X`C8v� O�(���(E�{���(��K�]�%j�x�6�02A2K�FVuz����8���`J\Am�x�C��d�Xlp��X�!K�d�u�@���c)Ht����27�8"/�?���M)�"_�WD?��p��|�$��kf���w�o�OB�L����f�F,��C�2�2i�zUU��
�Z��`C� q
��`��:aPY���"�K�}N.��0>�4�|z 9�dK� zZjr���ka�N����M�5c��Jh���a�`]gD2x�cu�t���a�z2�����oK�� B~���xY�9|]~-���4\ zY�uv��7S�>:@������E���Nf���+~���5ds�[����3�q;���`��p�FLH����xf?��[�"�����z|� I�D��?-:b����^SF��`���"7����Z�����`��6.k�m�5:kbPp�P����q����������89�b[(������5���:\����R��V��x
��������������� ��q�\�����K��F@V��?�K��]/������x��vR\�$���#�d~)����3!1l�9x�=�"���������T���>4
&��e1��V���Z�:d��>��n���TT|0({�����KSn+�cUEz��C�1��J(�UDT��Q�������xLR��=�
���H�*Iv�,GB"���r��� ����~
'�v�+x/c����w�T)�Y��x�kn�)3�?��fx�|,��X����
l�loq������2v���Ee�����VWu�->�Qb�Q���n������_V���Mb�n?s*5F�gDWt�����������x<9=S<�3�y�b,���� ��PI%%�3XJ��M�]���3 �C���1W���o����.&\� U���|�A��\��TC��u�i���D�`9v��x�n����W/��~D����+J�
��W��R����45����� ��1|�R����t?J
�f�!p:�*X�[�.�J�iZ��<u�4���T�q�Q�vaQd���M����Z�����b��<�{����K����^9x����`Jh��6��-����NO��9�%i6����������IcWVU��5��H\���w��GcV����
���E�;M��EW�d;!����3�u�)q1[�zb���gL��.�@�w\ ���Y�A��=�w������*)�Kz|��`6\�
�q��9�p�uMj����
R���$&��#�{~�fC=��"uO�b���8��>��)��q�y)�@Mk��/mqt`#u��-�G4x2�s�e�������JG��
Q/���DZH"7�G�������x>�Y=��D��L�l]��.����t���#E�\�of�;7� ����el� '�B� ���Mf�>��s����^(p#�q�$_M���C������<�+�?��q�%�!�E!Iv���17���p�?Y��$��
oq���C�Gf�E���Zz
p�^�7+WH<�^���N��(}^�5^U�l+5���[�1L�[�#�jY+m� ��1���O:R��t1E;��W�%��?<
������
�T����!�xtY����}2��-?%�P�������5�u��p��sW"��G������v
;��J]��6;�S:��fs���m���+~��mx�u��~��b� ^%�{}"\�T\�( ��(-6�xn����]<��|O\y�RO?�����7b?��%�l�>�������F[����-���cs�� |��� m�Z�����zz:�l���!<�V5����,��z����Zi�9���l�}fk7:���JT�9�jT����"� :�I�C�����@?vs�:���p\Vc�E��zc��$�!z
�v������K�F$S�<VX5V�p�\:�K��3���<���j�xQW�W�������~��$�C�[�
� �C�������-�2����?b�LhL7`��%U�/hI�4t���I'��Bh�G��ai��R��b������l�X�(r6Vz�#p'.CB���PL��B�v������$��(i.�11�[d^� B*H��{e��_-��A�����;���p��}!53g7-}=��,��$Bc�������i�Z��d������K�������t�Wb��k`����m1w����DG|�3=�b<w��v��V��B����be�2b����y�-�s���/%��m��39��y���_�:cq��OD��u����N�E%Y���r���t���s�����
U��(�(W�[N6OY�x�g5���;��`�z�
!(lz�n!�ML�H���{����X��\������'��8�h�j�Q�
��O5S��[1j�>qA�%�f�0I|���6� T�KX����7�0r���o'�;�]o�Pr ?\�t�����ph������[����.aT]t������d�t����M�f�$O�������R�B���x�`6=@G��'0�=���e���V����z�2�8�=:}5K�la��W�][����<b��T +���o�T�
C�������
@�+�[w3\t���3h����w$
DE���(=��
�h�����2�(���e�s]���W�I�����o\{'�G�R���"�l��e�/j�U���H6�a*�G���z ������B����Q�D!�Z����1���fhR������R�m��9��hn�hQ??&�5��=��W���f����i$���v��x|���SA��9�m;3�� U4r��w�I��k����NN
��ig~&�jX�S'�~!Ek$�q����C�1�t�{@?Y���;\�
~�n��`��&�f���NL&\ d������"kPd����C���v���y��{��v���;��hH%�%�h����=��
��r� G;�*��:�������+�buPH�?�`S1I����r���l�15��������H��`�(���\5z������O����7�vXc*n�y��:�'{Hy����8%�O�ij�P@�����b�w����t[q�3�%e����u�09��R?�!�i��u5�D���U���=���4���1���{��+'t��*�3�������]+�P��]�7Cxf$���%[���V�����yd�9
������Ok �~,�
�.�\�D�+ }D������B� ��Puqm��\�4;�`�w��L�a�9B@C�v��Fn�p���5`h1�C���V{����"e(��H�N�T�"6Vq+�6K�UU����7���I�t�R��J��3������������+�p-��U�jN#���h��d�+��4c�
|��Et�_<4�!18�;����"��.i��
M�\}n��f��r�c��Y�r��/��P�a� `������8���t�
u(�}�*HZ`a�P�M
���u��+V��q�}�`r�86U��J�6� l,3�~4���44�5yM�O��_����.���%��l}�x����YO�i�^hA�����U�. ��j Z���rvw���D'mJ�����=m�A��5���h��z����z�b�68�Z��Q�t���?s��n��[��S��1��|-����e�I;e�x_�������$ku�~��Jk{����n�?R�)L��K�t5����G�x%\��RI�^���bS�)F���f�il������83�M��$��l�?G.e�xJ� t�L������NhX����z"���sc���XN �<o�#�5^�9QW|��?��E-o���H�������H���/(�p��{����*{n������d �w��I��3�[���u�����r�_F)����)�r�jR�LP�[�|��6�`��)3�;� @��v��4Z�6��I�(������}X�,En��y&&d��C!�����"��H Q��B9TmI�cF#9[��x��K"��eh�Q���m4�����U�*U�]Ex�$��?����%��rB3����Vm�|a�����,y�k����.��p&^oT�.e�����i����n�]+sv)�5IH��������'��7����<NwcI,0�����vx�g
���X��
D*xN/d��
U�$�] ��cV�P��f`v�A[D������)4���g*l����8I���x���e���.�}Lut��.k��������v�Kg!x
��w��v�_]�>|/�R���meB��'\��L�����M����(.V�"�'W��"W��?�,����J]M%��S����C�2��n�g�AV�x����k����bMY������x�%�`�/��?�<������}���c^�fXHi�z\t[�bO-���K}*�!��������[����p1�u")�P�:y��d�����V>��KZ�'!U�lB�R �@h�[�2j�(#�����~�R�0� (���B����r��J�D�Rm�u����e ��t�R����x���?�-(�b�y�i�\
b+���/��>���)0�K&�o����{���(~��<�\��7X1��)����/��h��]T;�d�*@u��;y�)�&/��|��|�2�Uj�~�"�l�L�����?�[���j�~�,��9R��I(t��� ���U��P���)����(��d�~�9�/�i��W�CM��G�e�O�)�i�{�.�m���g����~��R$�)<X�:�<��!L�.P�'�M|�
����\������]������(nC��C�Rx�6�����������N9�ryb��y�(��+
5������G��������@d�`�S^�$�}b~�1H�n���T?E+�T&�>s�gl3�.�M�C�:��F�O�e��k2���g�O;`Paa��[��@O�9��-�]�.��c�����<Q�H5c2�c.X��+�X�HE��E$�P��^�+{cQZ������
.M�Wv �I��3��'r�����%3��S��N��%�R���r����v����hw��*^�%z5b���>��7��kk�wK/Iy�#b�
���1�%�P���|Z�`�Q�mKmM����m��+��[�$RM`��oOU�Y����.��������&M��\�\����g�tySPVKx�`N��1�����!� ����Z���
�P�WSuk����P�����K
!$�1� �}��(�+�Y�SF�z����]F���sP���u���#�>��7�h����j�p}��iP�6W)2�a~�}
t FHZ�:|�8�C�X��*w�I�^o��$�������)�reK��
����{����_�/Dm3wX4�_�.�����s��}]� 8q�)����7l�Ow���\�t���~$�6VY�C~�����&���d��s��������<�K��.�
�[L^U,��j[0L�>��v.�1�y�����N�y�k i��
�����Xf���BU�[W2����U����m�FR����HL�{/��h�Yn����$�sq��^�����'�����7_����7D&w ,[EQ�WLW��G������ 7Y�����yL>aOf�MC���I����� �t�_�x����:P������k�� m�����a�h�=�F��C*���j���L�"^�Q�@��h~�_�u��)#��jb\���+��V�m�|�-Tx���;%=�^bH��XLrx�{]9���BQ�A�+�]���=�-��"�i�;��HZ43'�9�B�6��
������r��/���
A�{ }qXP� !��<@}��&��@�����M��jsy og� ����J�Z?�sI����F���]�!��M"���������FG��)),=�i�5�1�=:4��"�yh�|��2��}�H��~N��BG�f���G���YV�� �����W�^G����������7�J�X&/�l. ��?�����H6�gEddd:9�6�Jh=
C{�X����J��<�E+m��'�[F���J�H��� �7x`�#�=
~�G��"B����1���;yG��~�/0�C.��9fF��q�@��/�y�n�N����QB�]P�Y�b.1���Ch'6��R����{����X�qy������TT��'tt���45�F��:vMJ��v���v�&�e��g�9���[7�������m[����p��""�U
�V�J���F�P�:g����~wU;&�����g<����MH�]���O
z�q����8���F�U> \!�T�L�e����� {#��� ��D�(� aA��'��Y*#��Fv���
����F���Z�%�;>r��Jk���oj��%d�$[��'�K� �:f��1%������I� �� �HW������+F��Bv�����|��I�I��
��Hv�a�������n>��<:�S�
�.��%:l$�O� Zm�*�D�A���lD`.���L�2����{i��� L
>��Ui=�J�:�/�$�/�or68_���A�����p�/���q�~�|���N��R'��D��9�9�w#��B�
����D���m�����v��t�f��T���|�L�I+-� X�(�����t�L=�J~hD�jq����`�������8��h�U�~��_�,�.�����qal����03J�{>.O>8�@aN�t���$O�U�d�o����Y����X=�M��x����0��G����!^ko�����e���5�cDk
����������:(I+7���MG(_���o�Y�)+�p���������B�������Hi�Q�*E
eQ�E+��0��%����~r�l�W��_L�{N9�1FD� ��C���%wZ�H�����b;C*pYj�C��}b4'0�)�*��A�������q���"�������Z�
��z��VE���m���)�}��o�L���X�q��6��\:W+m]����:n�H���n�<��H���H�J��D; �N�Q���������h�� ����6I3B�z�A�H)*�# ��M��*�����w��k��$_��d��F��@�=��Z��y{�On�A�R����NV����
�n�e��G����f��L�v�E}����U�<��9����^��c;WVUG�X���]j:
������� �s�9��}r�P����A %-�o���������|�p/2�M�8XWF?�9H���0�|":,��\�-���B�k���3�r�����&-�7�FV,��I�����TOu>����9Q����P)h>������=��-�:�3���PQ�G�m`���1�G���KpKe9�?����
�D��V5�U�2��NE)6�t���F�(��9�����������Z
�N��T���,��xV<����7����_�[2�$�������T�h1~+#��6�}�lu*C������P���l�E��X4GbS��W�z���w r�7����n��H��(����������vwH����_a�j|c���(�����PX��4Mq$wu��5�wy���-��
@�N�>dcE��|b���U�_�R�Y��9����2K c�~�'��U���
�������������q������:\�H���(3b_�v.���!�)��fc-��v�`!W�.^6�~y���?=�m��Wgnx�[������iI;W�Y\hA���G� ��@K��-%�?����Xo��,bk��X�9� c�KiJs0U��W������<�I}��@)���E�����~s�����o����yu+�s�ze���]����u���r��[�[��������=c��� �\��+V��k5�>��q��x�(o�L���F����n������}Gk'�h�ii�a�ki{�V+
�Y��v��xJ�@����R; ���N�!"N|-c�������E����?�����'�(@`�����Lf���u�{HgnSz�E�<�M:����C�M��2�tB�,p$Q���
�W����?�� P���7�$��`��f��������!����s���\�?�?��8��@�uW\���fh| ��d}'"��a���
y��������H���A����3�����f��D��@y[���z��v7�
2R�m�7��`�T�%�w���QS�\@EYQx�"�s����������W�N��\���MH�������.����_4.X���y�q@�������$K����*��P�1GdH������_Mu�S�4<g�Qn����_.� �&�0j����,I�z�����3L�
aEU��u�a�y
�=�u��}1
,���Y��u�E�z���7�>��g�(@q}P�W��>Y�o���z����o7�����^����F�������/q|f-[����-�Q���@<��RE}{�%
���?;�l_V���fe����Y����I�h(cR���y��?���&axL\�m��3p����=m������x��qs|��q�����7�������-�
�R���e�}�b���@�2'����rEb��Qn���!oq3|�F����.���A(E^]��A�>q�t����a�u8�4��KfA�+��nK4)��`�\\Bi
����g�j��W�Dv��_�lu� ���d8Ncy��x)�R, �`��S9��w�.���:���u�8����h�"d<2*�AX�;{�����E��B����U��=��73 ���{b&�-s�Z1^r���
�����%�%=N�'z3��[M-vK��
���v?������J��?�1�.i/,�.�+���sf��������$5$�y�u�N�]YD�#�����<��:���(��(�{H�����
O(cb�2$ HI�zc+��p�Bl�?��2�Y���
_.,������D�OGn`R���g�������vG<QY����� ���-(���{�)�IW`���|��`� d�������2�6Rv��FrZ�v-q���ZL}�s�j�6�yG�[�@����+���j�0���q���x��P}/����>�D j����j��;O��$������B�������2��D�l�����-"�ULg^����C ���9��=Z�!)��.K������{9���5���I���+�.:�l��>)���^�6W��`8������Jk����-�J��������O�F�C���k{�%��q1����v"xH@5����L`<%d�{n������@R"�-'��m�[���n1��e!��p 4�luX�����7�$���V:�:S�C�J��f�{E��"U���J�n��=#,W�
�IS���*6�����j'���hl���QHL����H�]����F���R������`w���_7��+��h����9L�M�/E���J�j2tU����gG{{3�w�fJ�<q�Q��v��]����;����%+�������d�3iY�Xw�41%C*�rV�����xZ�iG���h�� ������6n��Rp_ U���f��� {����%��q�$5n 4���<��N(*u�_��z�g��6>��N�����SH|�������*�h�M�Z*F������C�����G����`WU��Qhy5��+_��
*�C���"�c�OY�� ���(� �=�_�<���{�PX@�p@�������3�������/������8�W�f��zu���E��I.��$��� a�[����
?`T����Z�KiVu�����W@8���sG�����
+�T����h
d��0%
�f��^o�d��T�~������@��Ul�4��V�L� ���^�"��
����5rv�E��P�t��#����n%M���������m�Qh�b�77\@<@�=�1C)l?a�e�5#�gN���[��3����&_�
�5�v����B@���P���`�����I�K��v����|$7
#vu;"��
��Nq9��A�D������x����p���G�,��"��-�s�����i�7�74�*���M#m
���K����r�]���sxqyaV���z3�H��yq�p�P��4LC�+? _+�Fv�&r�8Ej����s��sg_�>v�2�Q��W8y���fQ]b�W�M�J�B��(~p���W_��Q6��X��
�;�Av{/�Pz�}FA�������|{����&R�!��K4�s;�������y.84�i����oK���X�CQ�1�|��zf���p!����l{�r�9���������k"[�r�L�!.1����u�0.��4�X*��o�O��m<�H���P��[iAP�w�����U{]�9���o���L����N�����0�a
�J$�j1��������Q�]=���9�I��@Xwz�1����w��]�(��G�f��]�S�rFQ*U���������1%�C�=����O�e1v&��8X� `�v[�D��
���h7�k��o��^�����D�CM������D��4e-��gC�P��%����m04 ��U��-���K�*e���������z������Q>T�N�(�y��j#���T�(Q�bs��:`����m� ����8++3]��pW��1�}��i�~���L����WI�,�����y��Y�q,.����f��L��P������?"����eWMs�l�@9����&���*�Y"���,�.4���z��SPc��}���������R5+�����{����E
���f�[��
Vj<G�}�=G;������ ���R�
�����k�.�%�t�{9��h�TK��n�*�#Z�D�6����,��������
L�=�`l�$��}t#�X������V����g�$:�������8��: EN���0Ia���%Th��%:�����/�r�a�@�U���iV�k��t�l�JB�Uy�Yc�� �i�
i� ���E
��!v?{2^�q� O���i����b��l��b��q�E�#�&�g���e/�^=*zj�<
�� �����P������/���#�ji��Aj��{�+�W���A.e�j����Y��k:�'��ls��O
����w�z;�5QQ��� �����3���Z�{��F}C�GD��4o���)��������d�}�e*OmH%��}/f��6l�I���OM��o���Q�$��Nn��fY�� F#{Tn���B�L$i��(�3$0�wK�-r�r�
0(�>�����6T����
{]���5��z��{�u�>����J��������M��7�����T�=S�S����T�v��8*���[��(��+u���@'��dp���a�F�F�T��wD�F���"*E�$B�@^��s����~��r��l� ��Q7���L6 ni4 �X}���p�4��s��z�Md����?TA�&�������^*\���2�
�E�2��!�����J�.�_���ZS�>f=PL ����q�5��%,f�����`���an��[����C�{N�-/g������/����cr�1�h��������������j������gW�S<3��������)���Y��V�R98!�������XV|�S��n�����9(��a�dO7�g9.�ma���#�N�����;`c�����W���>�y���4P_�sb�N�b�!!�� )��4�w������������p�|&J�7��d����a�}�������8r�7�5:�e���4��~�3�m�������2 ��Ze�1+7m��%x��2P�G�m���e_dt���TQ����\��o�
h_t^����n�������2�v��t��eRU�h:
���u�O�$!��a��(���'��\v��T����O��A�K����!�=��@�G�U�"�y�g�] ���i�Dye��l.y�_��^�(�=�gdgZ��w7%�-�[-�fp���t�I}F�.e ���8t����Z�����I,��o��.����5�;�XS:<_����r�%M��*��,�hem!��������n����[�_|01��!�G&n��i��a�K2:��?v �.j�o�U��;2nr?����kc�E�r�����U����wR�%I�5f�[�8[8I��z5�����&0'>Ee�n���S����V�A����w7u ���K��*J�L�c��>b������/x:hf� k(��3-��5*�B��8q"N�e��1$������p��"���0o|7OQ?����er�+������Q�7��N��YJ�j�!�A�/������@��]�&D�=��E$\s��� 7Ix�l��I��������}�'�����P�o04-vx���c1h�[*����On�B�����]g������?n
���?sx��y>�ac�P]V���JUVw��:�R�����M���c��Wg>�;�-���q�y�-����m`g����
��t�Wq�����r������2.O
�f�� �"�E>��`OH54����j�/�����P����a�����Q2�����D+s�����=sev�j�]�Zj���������2��C������ ����Bk�U�v���,-�I�7������_.�:o_������O������;_,h�����9_��S|��>U�����u��@Q3���P���}�'����������?�A��n���@���z������j�v�/rU���w�lB�3Lo�UE�Z;Vt�g��V�{�Z��p\�:K��6$���O���R�ZQ�?W��I�`#"S��j� '����yP3B��@Vis��\��z�N!�D{49F���uoKoO�!K2_@|���G�f� #:���M�~��&�lRdpU�6�%�7T�`�f����Z���W���9z1�[����.S|2<�� )[�|f���V� k��v�*�>z�@�"