locking [user] catalog tables vs 2pc vs logical rep

Started by Andres Freundalmost 5 years ago57 messages
#1Andres Freund
Andres Freund
andres@anarazel.de

Hi,

The 2pc decoding added in

commit a271a1b50e9bec07e2ef3a05e38e7285113e4ce6
Author: Amit Kapila <akapila@postgresql.org>
Date: 2021-01-04 08:34:50 +0530

Allow decoding at prepare time in ReorderBuffer.

has a deadlock danger when used in a way that takes advantage of
separate decoding of the 2PC PREPARE.

I assume the goal of decoding the 2PC PREPARE is so one can wait for the
PREPARE to have logically replicated, before doing the COMMIT PREPARED.

However, currently it's pretty easy to get into a state where logical
decoding cannot progress until the 2PC transaction has
committed/aborted. Which essentially would lead to undetected deadlocks.

The problem is that catalog tables accessed during logical decoding need
to get locked (otherwise e.g. a table rewrite could happen
concurrently). But if the prepared transaction itself holds a lock on a
catalog table, logical decoding will block on that lock - which won't be
released until replication progresses. A deadlock.

A trivial example:

SELECT pg_create_logical_replication_slot('test', 'test_decoding');
CREATE TABLE foo(id serial primary key);
BEGIN;
LOCK pg_class;
INSERT INTO foo DEFAULT VALUES;
PREPARE TRANSACTION 'foo';

-- hangs waiting for pg_class to be unlocked
SELECT pg_logical_slot_get_changes('test', NULL, NULL, 'two-phase-commit', '1');

Now, more realistic versions of this scenario would probably lock a
'user catalog table' containing replication metadata instead of
pg_class, but ...

At first this seems to be a significant issue. But on the other hand, if
you were to shut the cluster down in this situation (or disconnect all
sessions), you have broken cluster on your hand - without logical
decoding being involved. As it turns out, we need to read pg_class to
log in... And I can't remember this being reported to be a problem?

Perhaps all that we need to do is to disallow 2PC prepare if [user]
catalog tables have been locked exclusively? Similar to how we're
disallowing preparing tables with temp table access.

Greetings,

Andres Freund

#2Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#1)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, Feb 23, 2021 at 3:58 AM Andres Freund <andres@anarazel.de> wrote:

At first this seems to be a significant issue. But on the other hand, if
you were to shut the cluster down in this situation (or disconnect all
sessions), you have broken cluster on your hand - without logical
decoding being involved. As it turns out, we need to read pg_class to
log in... And I can't remember this being reported to be a problem?

I don't remember seeing such a report but I think that is a reason
enough (leaving aside logical decoding of 2PC) to either disallow
locking catalog tables or at least document it in some way.

Perhaps all that we need to do is to disallow 2PC prepare if [user]
catalog tables have been locked exclusively?

Right, and we have discussed this during development [1]/messages/by-id/CAA4eK1JeeXOwD6rYnhSOYk5YN-fUTmxe1GkTpN2-BvgnKN6gZg@mail.gmail.com[2]/messages/by-id/CAMGcDxf83P5SGnGH52=_0wRP9pO6uRWCMRwAA0nxKtZvir2_vQ@mail.gmail.com. We
thought either we disallow this operation or will document it. I
thought of doing this along with a core-implementation of Prepare
waiting to get it logically replicated. But at this stage, I think if
the user wants he can do a similar thing in his application where
after prepare it can wait for the transaction to get logically
replicated (if they have their own replication solution based on
logical decoding) and then decide whether to rollback or commit. So,
maybe we should either disallow this operation or at least document
it. What do you think?

Similar to how we're
disallowing preparing tables with temp table access.

Yeah, we disallow other things like pg_export_snapshot as well in a
Prepared transaction, so we can probably disallow this operation as
well.

[1]: /messages/by-id/CAA4eK1JeeXOwD6rYnhSOYk5YN-fUTmxe1GkTpN2-BvgnKN6gZg@mail.gmail.com
[2]: /messages/by-id/CAMGcDxf83P5SGnGH52=_0wRP9pO6uRWCMRwAA0nxKtZvir2_vQ@mail.gmail.com

--
With Regards,
Amit Kapila.

#3Andres Freund
Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#2)
Re: locking [user] catalog tables vs 2pc vs logical rep

Hi,

On 2021-02-23 08:56:39 +0530, Amit Kapila wrote:

On Tue, Feb 23, 2021 at 3:58 AM Andres Freund <andres@anarazel.de> wrote:

Perhaps all that we need to do is to disallow 2PC prepare if [user]
catalog tables have been locked exclusively?

Right, and we have discussed this during development [1][2].

I remember bringing it up before as well... Issues like this really need
to be mentioned as explicit caveats at least somewhere in the code and
commit message. You can't expect people to look at 3+ year old threads.

Greetings,

Andres Freund

#4Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#3)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, Feb 23, 2021 at 9:09 AM Andres Freund <andres@anarazel.de> wrote:

On 2021-02-23 08:56:39 +0530, Amit Kapila wrote:

On Tue, Feb 23, 2021 at 3:58 AM Andres Freund <andres@anarazel.de> wrote:

Perhaps all that we need to do is to disallow 2PC prepare if [user]
catalog tables have been locked exclusively?

Right, and we have discussed this during development [1][2].

I remember bringing it up before as well... Issues like this really need
to be mentioned as explicit caveats at least somewhere in the code and
commit message.

Okay, so is it sufficient to add comments in code, or do we want to
add something in docs? I am not completely sure if we need to add in
docs till we have core-implementation of prepare waiting to get
logically replicated.

--
With Regards,
Amit Kapila.

#5Andres Freund
Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#4)
Re: locking [user] catalog tables vs 2pc vs logical rep

Hi

On 2021-02-23 09:24:18 +0530, Amit Kapila wrote:

Okay, so is it sufficient to add comments in code, or do we want to
add something in docs? I am not completely sure if we need to add in
docs till we have core-implementation of prepare waiting to get
logically replicated.

There's plenty users of logical decoding that aren't going through the
normal replication mechanism - so they can hit this. So I think it needs
to be documented somewhere.

Greetings,

Andres Freund

#6Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#5)
1 attachment(s)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, Feb 23, 2021 at 9:33 AM Andres Freund <andres@anarazel.de> wrote:

On 2021-02-23 09:24:18 +0530, Amit Kapila wrote:

Okay, so is it sufficient to add comments in code, or do we want to
add something in docs? I am not completely sure if we need to add in
docs till we have core-implementation of prepare waiting to get
logically replicated.

There's plenty users of logical decoding that aren't going through the
normal replication mechanism - so they can hit this. So I think it needs
to be documented somewhere.

As per discussion, the attached patch updates both docs and comments
in the code.

--
With Regards,
Amit Kapila.

Attachments:

v1-0001-Update-the-docs-and-comments-for-decoding-of-prep.patchapplication/octet-stream; name=v1-0001-Update-the-docs-and-comments-for-decoding-of-prep.patch
#7Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#6)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, Feb 23, 2021 at 12:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Feb 23, 2021 at 9:33 AM Andres Freund <andres@anarazel.de> wrote:

On 2021-02-23 09:24:18 +0530, Amit Kapila wrote:

Okay, so is it sufficient to add comments in code, or do we want to
add something in docs? I am not completely sure if we need to add in
docs till we have core-implementation of prepare waiting to get
logically replicated.

There's plenty users of logical decoding that aren't going through the
normal replication mechanism - so they can hit this. So I think it needs
to be documented somewhere.

As per discussion, the attached patch updates both docs and comments
in the code.

I have pushed this patch.

--
With Regards,
Amit Kapila.

#8vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Andres Freund (#1)
1 attachment(s)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, Feb 23, 2021 at 3:59 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

The 2pc decoding added in

commit a271a1b50e9bec07e2ef3a05e38e7285113e4ce6
Author: Amit Kapila <akapila@postgresql.org>
Date: 2021-01-04 08:34:50 +0530

Allow decoding at prepare time in ReorderBuffer.

has a deadlock danger when used in a way that takes advantage of
separate decoding of the 2PC PREPARE.

I assume the goal of decoding the 2PC PREPARE is so one can wait for the
PREPARE to have logically replicated, before doing the COMMIT PREPARED.

However, currently it's pretty easy to get into a state where logical
decoding cannot progress until the 2PC transaction has
committed/aborted. Which essentially would lead to undetected deadlocks.

The problem is that catalog tables accessed during logical decoding need
to get locked (otherwise e.g. a table rewrite could happen
concurrently). But if the prepared transaction itself holds a lock on a
catalog table, logical decoding will block on that lock - which won't be
released until replication progresses. A deadlock.

A trivial example:

SELECT pg_create_logical_replication_slot('test', 'test_decoding');
CREATE TABLE foo(id serial primary key);
BEGIN;
LOCK pg_class;
INSERT INTO foo DEFAULT VALUES;
PREPARE TRANSACTION 'foo';

-- hangs waiting for pg_class to be unlocked
SELECT pg_logical_slot_get_changes('test', NULL, NULL, 'two-phase-commit', '1');

Now, more realistic versions of this scenario would probably lock a
'user catalog table' containing replication metadata instead of
pg_class, but ...

At first this seems to be a significant issue. But on the other hand, if
you were to shut the cluster down in this situation (or disconnect all
sessions), you have broken cluster on your hand - without logical
decoding being involved. As it turns out, we need to read pg_class to
log in... And I can't remember this being reported to be a problem?

Perhaps all that we need to do is to disallow 2PC prepare if [user]
catalog tables have been locked exclusively? Similar to how we're
disallowing preparing tables with temp table access.

Even I felt we should not allow prepare a transaction that has locked
system tables, as it does not allow creating a new session after
restart and also causes the deadlock while logical decoding of
prepared transaction.
I have made a patch to make the prepare transaction fail in this
scenario. Attached the patch for the same.
Thoughts?

Regards,
Vignesh

Attachments:

v1-0001-Fail-prepared-transaction-if-transaction-has-lock.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Fail-prepared-transaction-if-transaction-has-lock.patch
#9Ajin Cherian
Ajin Cherian
itsajin@gmail.com
In reply to: vignesh C (#8)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, Mar 16, 2021 at 1:36 AM vignesh C <vignesh21@gmail.com> wrote:

On Tue, Feb 23, 2021 at 3:59 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

The 2pc decoding added in

commit a271a1b50e9bec07e2ef3a05e38e7285113e4ce6
Author: Amit Kapila <akapila@postgresql.org>
Date: 2021-01-04 08:34:50 +0530

Allow decoding at prepare time in ReorderBuffer.

has a deadlock danger when used in a way that takes advantage of
separate decoding of the 2PC PREPARE.

I assume the goal of decoding the 2PC PREPARE is so one can wait for the
PREPARE to have logically replicated, before doing the COMMIT PREPARED.

However, currently it's pretty easy to get into a state where logical
decoding cannot progress until the 2PC transaction has
committed/aborted. Which essentially would lead to undetected deadlocks.

The problem is that catalog tables accessed during logical decoding need
to get locked (otherwise e.g. a table rewrite could happen
concurrently). But if the prepared transaction itself holds a lock on a
catalog table, logical decoding will block on that lock - which won't be
released until replication progresses. A deadlock.

A trivial example:

SELECT pg_create_logical_replication_slot('test', 'test_decoding');
CREATE TABLE foo(id serial primary key);
BEGIN;
LOCK pg_class;
INSERT INTO foo DEFAULT VALUES;
PREPARE TRANSACTION 'foo';

-- hangs waiting for pg_class to be unlocked
SELECT pg_logical_slot_get_changes('test', NULL, NULL,

'two-phase-commit', '1');

Now, more realistic versions of this scenario would probably lock a
'user catalog table' containing replication metadata instead of
pg_class, but ...

At first this seems to be a significant issue. But on the other hand, if
you were to shut the cluster down in this situation (or disconnect all
sessions), you have broken cluster on your hand - without logical
decoding being involved. As it turns out, we need to read pg_class to
log in... And I can't remember this being reported to be a problem?

Perhaps all that we need to do is to disallow 2PC prepare if [user]
catalog tables have been locked exclusively? Similar to how we're
disallowing preparing tables with temp table access.

Even I felt we should not allow prepare a transaction that has locked
system tables, as it does not allow creating a new session after
restart and also causes the deadlock while logical decoding of
prepared transaction.
I have made a patch to make the prepare transaction fail in this
scenario. Attached the patch for the same.
Thoughts?

The patch applies fine on HEAD and "make check" passes fine. No major
comments on the patch, just a minor comment:

If you could change the error from, " cannot PREPARE a transaction that has
a lock on user catalog/system table(s)"
to "cannot PREPARE a transaction that has an *exclusive lock* on user
catalog/system table(s)" that would be a more
accurate instruction to the user.

regards,
Ajin Cherian
Fujitsu Australia

#10vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Ajin Cherian (#9)
1 attachment(s)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Wed, Mar 31, 2021 at 2:35 PM Ajin Cherian <itsajin@gmail.com> wrote:

The patch applies fine on HEAD and "make check" passes fine. No major comments on the patch, just a minor comment:

If you could change the error from, " cannot PREPARE a transaction that has a lock on user catalog/system table(s)"
to "cannot PREPARE a transaction that has an exclusive lock on user catalog/system table(s)" that would be a more
accurate instruction to the user.

Thanks for reviewing the patch.
Please find the updated patch which includes the fix for the same.

Regards,
Vignesh

Attachments:

v2-0001-Fail-prepared-transaction-if-transaction-has-lock.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Fail-prepared-transaction-if-transaction-has-lock.patch
#11vignesh C
vignesh C
vignesh21@gmail.com
In reply to: vignesh C (#10)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Wed, Mar 31, 2021 at 5:47 PM vignesh C <vignesh21@gmail.com> wrote:

On Wed, Mar 31, 2021 at 2:35 PM Ajin Cherian <itsajin@gmail.com> wrote:

The patch applies fine on HEAD and "make check" passes fine. No major comments on the patch, just a minor comment:

If you could change the error from, " cannot PREPARE a transaction that has a lock on user catalog/system table(s)"
to "cannot PREPARE a transaction that has an exclusive lock on user catalog/system table(s)" that would be a more
accurate instruction to the user.

Thanks for reviewing the patch.
Please find the updated patch which includes the fix for the same.

This similar problem exists in case of synchronous replication setup
having synchronous_standby_names referring to the subscriber, when we
do the steps "begin;lock pg_class; insert into test1 values(10);
commit". In this case while decoding of commit, the commit will wait
while trying to acquire a lock on pg_class relation, stack trace for
the same is given below:
#4 0x0000556936cd5d37 in ProcSleep (locallock=0x556937de8728,
lockMethodTable=0x5569371c2620 <default_lockmethod>) at proc.c:1361
#5 0x0000556936cc294a in WaitOnLock (locallock=0x556937de8728,
owner=0x556937e3cd90) at lock.c:1858
#6 0x0000556936cc1231 in LockAcquireExtended (locktag=0x7ffcbb23cff0,
lockmode=1, sessionLock=false, dontWait=false, reportMemoryError=true,
locallockp=0x7ffcbb23cfe8)
at lock.c:1100
#7 0x0000556936cbdbce in LockRelationOid (relid=1259, lockmode=1) at lmgr.c:117
#8 0x00005569367afb12 in relation_open (relationId=1259, lockmode=1)
at relation.c:56
#9 0x00005569368888a2 in table_open (relationId=1259, lockmode=1) at table.c:43
#10 0x0000556936e90a91 in RelidByRelfilenode (reltablespace=0,
relfilenode=16385) at relfilenodemap.c:192
#11 0x0000556936c40361 in ReorderBufferProcessTXN (rb=0x556937e8e760,
txn=0x556937eb8778, commit_lsn=23752880, snapshot_now=0x556937ea0a90,
command_id=0, streaming=false)
at reorderbuffer.c:2122
#12 0x0000556936c411b7 in ReorderBufferReplay (txn=0x556937eb8778,
rb=0x556937e8e760, xid=590, commit_lsn=23752880, end_lsn=23752928,
commit_time=672204445820756,
origin_id=0, origin_lsn=0) at reorderbuffer.c:2589
#13 0x0000556936c41239 in ReorderBufferCommit (rb=0x556937e8e760,
xid=590, commit_lsn=23752880, end_lsn=23752928,
commit_time=672204445820756, origin_id=0, origin_lsn=0)
at reorderbuffer.c:2613
#14 0x0000556936c2f4d9 in DecodeCommit (ctx=0x556937e8c750,
buf=0x7ffcbb23d610, parsed=0x7ffcbb23d4b0, xid=590, two_phase=false)
at decode.c:744

Thoughts?

Regards,
Vignesh

#12Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#11)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, Apr 20, 2021 at 9:57 AM vignesh C <vignesh21@gmail.com> wrote:

This similar problem exists in case of synchronous replication setup
having synchronous_standby_names referring to the subscriber, when we
do the steps "begin;lock pg_class; insert into test1 values(10);
commit". In this case while decoding of commit, the commit will wait
while trying to acquire a lock on pg_class relation,

So, this appears to be an existing caveat of synchronous replication.
If that is the case, I am not sure if it is a good idea to just block
such ops for the prepared transaction. Also, what about other
operations which acquire an exclusive lock on [user]_catalog_tables
like:
cluster pg_trigger using pg_class_oid_index, similarly cluster on any
user_catalog_table, then the other problematic operation could
truncate of user_catalog_table as is discussed in another thread [1]https://www.postgresql.org/message- id/OSBPR01MB4888314C70DA6B112E32DD6AED2B9%40OSBPR01MB4888.jpnprd01.prod.outlook.com.
I think all such operations can block even with synchronous
replication. I am not sure if we can create examples for all cases
because for ex. we don't have use of user_catalog_tables in-core but
maybe for others, we can try to create examples and see what happens?

If all such operations can block for synchronous replication and
prepared transactions replication then we might want to document them
as caveats at page:
https://www.postgresql.org/docs/devel/logicaldecoding-synchronous.html
and then also give the reference for these caveats at prepared
transactions page:https://www.postgresql.org/docs/devel/logicaldecoding-two-phase-commits.html

What do you think?

As this appears to be an existing caveat of logical replication, I
have added the Petr and Peter E in this email.

[1]: https://www.postgresql.org/message- id/OSBPR01MB4888314C70DA6B112E32DD6AED2B9%40OSBPR01MB4888.jpnprd01.prod.outlook.com
id/OSBPR01MB4888314C70DA6B112E32DD6AED2B9%40OSBPR01MB4888.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.

#13Michael Paquier
Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#1)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Mon, Feb 22, 2021 at 02:28:47PM -0800, Andres Freund wrote:

Perhaps all that we need to do is to disallow 2PC prepare if [user]
catalog tables have been locked exclusively? Similar to how we're
disallowing preparing tables with temp table access.

At least for anything involving critical relations that get loaded at
startup? It seems to me that if we can avoid users to get them
completely locked out even if they run the operation on an object
they own, that would be better than requiring tweaks involving
pg_resetwal or equal to rip of the 2PC transaction from existence.
--
Michael

#14Michael Paquier
Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#12)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Mon, May 24, 2021 at 10:03:01AM +0530, Amit Kapila wrote:

So, this appears to be an existing caveat of synchronous replication.
If that is the case, I am not sure if it is a good idea to just block
such ops for the prepared transaction. Also, what about other
operations which acquire an exclusive lock on [user]_catalog_tables
like:
cluster pg_trigger using pg_class_oid_index, similarly cluster on any
user_catalog_table, then the other problematic operation could
truncate of user_catalog_table as is discussed in another thread [1].
I think all such operations can block even with synchronous
replication. I am not sure if we can create examples for all cases
because for ex. we don't have use of user_catalog_tables in-core but
maybe for others, we can try to create examples and see what happens?

If all such operations can block for synchronous replication and
prepared transactions replication then we might want to document them
as caveats at page:
https://www.postgresql.org/docs/devel/logicaldecoding-synchronous.html
and then also give the reference for these caveats at prepared
transactions page:https://www.postgresql.org/docs/devel/logicaldecoding-two-phase-commits.html

What do you think?

It seems to me that the 2PC issues on catalog tables and the issues
related to logical replication in synchonous mode are two distinct
things that need to be fixed separately.

The issue with LOCK taken on a catalog while a PREPARE TRANSACTION
holds locks around is bad enough in itself as it could lock down a
user from a cluster as long as the PREPARE TRANSACTION is not removed
from WAL (say the relation is critical for the connection startup).
This could be really disruptive for the user even if he tried to take
a lock on an object he owns, and the way to recover is not easy here,
and the way to recover involves either an old backup or worse,
pg_resetwal.

The second issue with logical replication is still disruptive, but it
looks to me more like a don't-do-it issue, and documenting the caveats
sounds fine enough.

Looking at the patch from upthread..

+           /*
+            * Make note that we've locked a system table or an user catalog
+            * table. This flag will be checked later during prepare transaction
+            * to fail the prepare transaction.
+            */
+           if (lockstmt->mode >= ExclusiveLock &&
+               (IsCatalogRelationOid(reloid) ||
+                RelationIsUsedAsCatalogTable(rel)))
+               MyXactFlags |= XACT_FLAGS_ACQUIREDEXCLUSIVELOCK_SYSREL;
I think that I'd just use IsCatalogRelationOid() here, and I'd be more
severe and restrict all attempts for any lock levels.  It seems to me
that this needs to happen within RangeVarCallbackForLockTable().
I would also rename the flag as just XACT_FLAGS_LOCKEDCATALOG.
+                errmsg("cannot PREPARE a transaction that has an exclusive lock on user catalog/system table(s)")));
What about "cannot PREPARE a transaction that has locked a catalog
relation"?
--
Michael
#15osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: Amit Kapila (#12)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Monday, May 24, 2021 1:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Apr 20, 2021 at 9:57 AM vignesh C <vignesh21@gmail.com> wrote:

This similar problem exists in case of synchronous replication setup
having synchronous_standby_names referring to the subscriber, when we
do the steps "begin;lock pg_class; insert into test1 values(10);
commit". In this case while decoding of commit, the commit will wait
while trying to acquire a lock on pg_class relation,

So, this appears to be an existing caveat of synchronous replication.
If that is the case, I am not sure if it is a good idea to just block such ops for the
prepared transaction. Also, what about other operations which acquire an
exclusive lock on [user]_catalog_tables
like:
cluster pg_trigger using pg_class_oid_index, similarly cluster on any
user_catalog_table, then the other problematic operation could truncate of
user_catalog_table as is discussed in another thread [1].
I think all such operations can block even with synchronous replication. I am not
sure if we can create examples for all cases because for ex. we don't have use
of user_catalog_tables in-core but maybe for others, we can try to create
examples and see what happens?

If all such operations can block for synchronous replication and prepared
transactions replication then we might want to document them as caveats at
page:
https://www.postgresql.org/docs/devel/logicaldecoding-synchronous.html
and then also give the reference for these caveats at prepared transactions
page:https://www.postgresql.org/docs/devel/logicaldecoding-two-phase-com
mits.html

What do you think?

I've checked the behavior of CLUSTER command
in synchronous mode, one of the examples above, as well.

IIUC, you meant pg_class, and
the deadlock happens when I run cluster commands on pg_class using its index in synchronous mode.
The command I used is "BEGIN; CLUSTER pg_class USING pg_class_oid_index; END;".
This deadlock comes from 2 processes, the backend to wait synchronization of the standby
and the walsender process which wants to take a lock on pg_class.
Therefore, I think we need to do something, at least documentation fix,
as you mentioned.

From the perspective of restating,
when I restart the locked pub with fast and immediate mode,
in both cases, the pub succeeded in restart and accepted
interactive psql connections. So, after the restart,
we are released from the lock.

Best Regards,
Takamichi Osumi

#16Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#15)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, May 25, 2021 at 1:43 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Monday, May 24, 2021 1:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Apr 20, 2021 at 9:57 AM vignesh C <vignesh21@gmail.com> wrote:

This similar problem exists in case of synchronous replication setup
having synchronous_standby_names referring to the subscriber, when we
do the steps "begin;lock pg_class; insert into test1 values(10);
commit". In this case while decoding of commit, the commit will wait
while trying to acquire a lock on pg_class relation,

So, this appears to be an existing caveat of synchronous replication.
If that is the case, I am not sure if it is a good idea to just block such ops for the
prepared transaction. Also, what about other operations which acquire an
exclusive lock on [user]_catalog_tables
like:
cluster pg_trigger using pg_class_oid_index, similarly cluster on any
user_catalog_table, then the other problematic operation could truncate of
user_catalog_table as is discussed in another thread [1].
I think all such operations can block even with synchronous replication. I am not
sure if we can create examples for all cases because for ex. we don't have use
of user_catalog_tables in-core but maybe for others, we can try to create
examples and see what happens?

If all such operations can block for synchronous replication and prepared
transactions replication then we might want to document them as caveats at
page:
https://www.postgresql.org/docs/devel/logicaldecoding-synchronous.html
and then also give the reference for these caveats at prepared transactions
page:https://www.postgresql.org/docs/devel/logicaldecoding-two-phase-com
mits.html

What do you think?

I've checked the behavior of CLUSTER command
in synchronous mode, one of the examples above, as well.

IIUC, you meant pg_class, and
the deadlock happens when I run cluster commands on pg_class using its index in synchronous mode.
The command I used is "BEGIN; CLUSTER pg_class USING pg_class_oid_index; END;".
This deadlock comes from 2 processes, the backend to wait synchronization of the standby
and the walsender process which wants to take a lock on pg_class.

Have you tried to prepare this transaction? That won't be allowed. I
wanted to see if we can generate some scenarios where it is blocked
for prepared xacts decoding and for synchronous replication.

Therefore, I think we need to do something, at least documentation fix,
as you mentioned.

Yes, I think that is true.

--
With Regards,
Amit Kapila.

#17Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#14)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, May 25, 2021 at 12:40 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, May 24, 2021 at 10:03:01AM +0530, Amit Kapila wrote:

So, this appears to be an existing caveat of synchronous replication.
If that is the case, I am not sure if it is a good idea to just block
such ops for the prepared transaction. Also, what about other
operations which acquire an exclusive lock on [user]_catalog_tables
like:
cluster pg_trigger using pg_class_oid_index, similarly cluster on any
user_catalog_table, then the other problematic operation could
truncate of user_catalog_table as is discussed in another thread [1].
I think all such operations can block even with synchronous
replication. I am not sure if we can create examples for all cases
because for ex. we don't have use of user_catalog_tables in-core but
maybe for others, we can try to create examples and see what happens?

If all such operations can block for synchronous replication and
prepared transactions replication then we might want to document them
as caveats at page:
https://www.postgresql.org/docs/devel/logicaldecoding-synchronous.html
and then also give the reference for these caveats at prepared
transactions page:https://www.postgresql.org/docs/devel/logicaldecoding-two-phase-commits.html

What do you think?

It seems to me that the 2PC issues on catalog tables and the issues
related to logical replication in synchonous mode are two distinct
things that need to be fixed separately.

Fair enough. But the way we were looking at them as they will also
block (lead to deadlock) for logical replication of prepared
transactions and also logical replication in synchonous mode without
prepared transactions. Now, if we want to deal with the 2PC issues
separately that should be fine as well. However, for that we need to
see which all operations we want to block on [user]_catalog_tables.
The first one is lock command, then there are other operations like
Cluster which take exclusive lock on system catalog tables and we
allow them to be part of prepared transactions (example Cluster
pg_trigger using pg_trigger_oid_index;), another kind of operation is
Truncate on user_catalog_tables. Now, some of these might not allow
connecting after restart so we might need to think whether we want to
prohibit all such operations or only some of them.

The second issue with logical replication is still disruptive, but it
looks to me more like a don't-do-it issue, and documenting the caveats
sounds fine enough.

Right, that is what I also think.

--
With Regards,
Amit Kapila.

#18Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#17)
Re: locking [user] catalog tables vs 2pc vs logical rep

Amit Kapila <amit.kapila16@gmail.com> writes:

Fair enough. But the way we were looking at them as they will also
block (lead to deadlock) for logical replication of prepared
transactions and also logical replication in synchonous mode without
prepared transactions. Now, if we want to deal with the 2PC issues
separately that should be fine as well. However, for that we need to
see which all operations we want to block on [user]_catalog_tables.
The first one is lock command, then there are other operations like
Cluster which take exclusive lock on system catalog tables and we
allow them to be part of prepared transactions (example Cluster
pg_trigger using pg_trigger_oid_index;), another kind of operation is
Truncate on user_catalog_tables. Now, some of these might not allow
connecting after restart so we might need to think whether we want to
prohibit all such operations or only some of them.

2PC has pretty much always worked like that, and AFAIR there have been
a grand total of zero complaints about it. It seems quite likely to
me that you're proposing to expend a lot of effort on restrictions
that will hurt more people than they help. Maybe that score is only
about one to zero, but still you should account for the possibility
that you're breaking legitimate use-cases.

regards, tom lane

#19Petr Jelinek
Petr Jelinek
petr.jelinek@enterprisedb.com
In reply to: Amit Kapila (#17)
Re: locking [user] catalog tables vs 2pc vs logical rep

On 26 May 2021, at 05:04, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, May 25, 2021 at 12:40 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, May 24, 2021 at 10:03:01AM +0530, Amit Kapila wrote:

So, this appears to be an existing caveat of synchronous replication.
If that is the case, I am not sure if it is a good idea to just block
such ops for the prepared transaction. Also, what about other
operations which acquire an exclusive lock on [user]_catalog_tables
like:
cluster pg_trigger using pg_class_oid_index, similarly cluster on any
user_catalog_table, then the other problematic operation could
truncate of user_catalog_table as is discussed in another thread [1].
I think all such operations can block even with synchronous
replication. I am not sure if we can create examples for all cases
because for ex. we don't have use of user_catalog_tables in-core but
maybe for others, we can try to create examples and see what happens?

If all such operations can block for synchronous replication and
prepared transactions replication then we might want to document them
as caveats at page:
https://www.postgresql.org/docs/devel/logicaldecoding-synchronous.html
and then also give the reference for these caveats at prepared
transactions page:https://www.postgresql.org/docs/devel/logicaldecoding-two-phase-commits.html

What do you think?

It seems to me that the 2PC issues on catalog tables and the issues
related to logical replication in synchonous mode are two distinct
things that need to be fixed separately.

Fair enough. But the way we were looking at them as they will also
block (lead to deadlock) for logical replication of prepared
transactions and also logical replication in synchonous mode without
prepared transactions. Now, if we want to deal with the 2PC issues
separately that should be fine as well. However, for that we need to
see which all operations we want to block on [user]_catalog_tables.
The first one is lock command, then there are other operations like
Cluster which take exclusive lock on system catalog tables and we
allow them to be part of prepared transactions (example Cluster
pg_trigger using pg_trigger_oid_index;), another kind of operation is
Truncate on user_catalog_tables. Now, some of these might not allow
connecting after restart so we might need to think whether we want to
prohibit all such operations or only some of them.

IIRC this was discussed the first time 2PC decoding was proposed and everybody seemed fine with the limitation so I'd vote for just documenting it, same way as the sync rep issue.

If you'd prefer fixing it by blocking something, wouldn't it make more sense to simply not allow PREPARE if one of these operations was executed, similarly to what we do with temp table access?

--
Petr

#20vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#14)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, May 25, 2021 at 12:40 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, May 24, 2021 at 10:03:01AM +0530, Amit Kapila wrote:

So, this appears to be an existing caveat of synchronous replication.
If that is the case, I am not sure if it is a good idea to just block
such ops for the prepared transaction. Also, what about other
operations which acquire an exclusive lock on [user]_catalog_tables
like:
cluster pg_trigger using pg_class_oid_index, similarly cluster on any
user_catalog_table, then the other problematic operation could
truncate of user_catalog_table as is discussed in another thread [1].
I think all such operations can block even with synchronous
replication. I am not sure if we can create examples for all cases
because for ex. we don't have use of user_catalog_tables in-core but
maybe for others, we can try to create examples and see what happens?

If all such operations can block for synchronous replication and
prepared transactions replication then we might want to document them
as caveats at page:
https://www.postgresql.org/docs/devel/logicaldecoding-synchronous.html
and then also give the reference for these caveats at prepared
transactions page:https://www.postgresql.org/docs/devel/logicaldecoding-two-phase-commits.html

What do you think?

It seems to me that the 2PC issues on catalog tables and the issues
related to logical replication in synchonous mode are two distinct
things that need to be fixed separately.

The issue with LOCK taken on a catalog while a PREPARE TRANSACTION
holds locks around is bad enough in itself as it could lock down a
user from a cluster as long as the PREPARE TRANSACTION is not removed
from WAL (say the relation is critical for the connection startup).
This could be really disruptive for the user even if he tried to take
a lock on an object he owns, and the way to recover is not easy here,
and the way to recover involves either an old backup or worse,
pg_resetwal.

The second issue with logical replication is still disruptive, but it
looks to me more like a don't-do-it issue, and documenting the caveats
sounds fine enough.

Looking at the patch from upthread..

+           /*
+            * Make note that we've locked a system table or an user catalog
+            * table. This flag will be checked later during prepare transaction
+            * to fail the prepare transaction.
+            */
+           if (lockstmt->mode >= ExclusiveLock &&
+               (IsCatalogRelationOid(reloid) ||
+                RelationIsUsedAsCatalogTable(rel)))
+               MyXactFlags |= XACT_FLAGS_ACQUIREDEXCLUSIVELOCK_SYSREL;
I think that I'd just use IsCatalogRelationOid() here, and I'd be more
severe and restrict all attempts for any lock levels.  It seems to me
that this needs to happen within RangeVarCallbackForLockTable().
I would also rename the flag as just XACT_FLAGS_LOCKEDCATALOG.

+ errmsg("cannot PREPARE a transaction that has an exclusive lock on user catalog/system table(s)")));
What about "cannot PREPARE a transaction that has locked a catalog
relation"?

At this point it is not clear if we are planning to fix this issue by
throwing an error or document it. I will fix these comments once we
come to consensus.

Regards,
Vignesh

#21Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Petr Jelinek (#19)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Wed, May 26, 2021 at 1:53 PM Petr Jelinek
<petr.jelinek@enterprisedb.com> wrote:

On 26 May 2021, at 05:04, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, May 25, 2021 at 12:40 PM Michael Paquier <michael@paquier.xyz> wrote:

It seems to me that the 2PC issues on catalog tables and the issues
related to logical replication in synchonous mode are two distinct
things that need to be fixed separately.

Fair enough. But the way we were looking at them as they will also
block (lead to deadlock) for logical replication of prepared
transactions and also logical replication in synchonous mode without
prepared transactions. Now, if we want to deal with the 2PC issues
separately that should be fine as well. However, for that we need to
see which all operations we want to block on [user]_catalog_tables.
The first one is lock command, then there are other operations like
Cluster which take exclusive lock on system catalog tables and we
allow them to be part of prepared transactions (example Cluster
pg_trigger using pg_trigger_oid_index;), another kind of operation is
Truncate on user_catalog_tables. Now, some of these might not allow
connecting after restart so we might need to think whether we want to
prohibit all such operations or only some of them.

IIRC this was discussed the first time 2PC decoding was proposed and everybody seemed fine with the limitation so I'd vote for just documenting it, same way as the sync rep issue.

+1.

If you'd prefer fixing it by blocking something, wouldn't it make more sense to simply not allow PREPARE if one of these operations was executed, similarly to what we do with temp table access?

The point was that even if somehow we block for prepare, there doesn't
seem to be a simple way for synchronous logical replication which can
also have similar problems. So, I would prefer to document it and we
can even think to backpatch the sync rep related documentation.
Michael seems to be a bit interested in dealing with some of the 2PC
issues due to reasons different than logical replication which I am
not completely sure is a good idea and Tom also feels that is not a
good idea.

--
With Regards,
Amit Kapila.

#22Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: Andres Freund (#1)
1 attachment(s)
Re: locking [user] catalog tables vs 2pc vs logical rep

Hi.

The attached PG docs patch about catalog deadlocks was previously
implemented in another thread [1]/messages/by-id/CAA4eK1K+SeT31pxwL5iTvXq=JhZpG_cUJLFhiz-eD+Jr-WAPeg@mail.gmail.com, but it seems more relevant to this
one.

PSA.

------
[1]: /messages/by-id/CAA4eK1K+SeT31pxwL5iTvXq=JhZpG_cUJLFhiz-eD+Jr-WAPeg@mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia.

Attachments:

pg_docs_deadlock_note.diffapplication/octet-stream; name=pg_docs_deadlock_note.diff
#23osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: Peter Smith (#22)
1 attachment(s)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Tuesday, June 1, 2021 4:33 PM Peter Smith <smithpb2250@gmail.com>

To: Andres Freund <andres@anarazel.de>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>; Amit Kapila
<amit.kapila16@gmail.com>; Markus Wanner
<markus.wanner@enterprisedb.com>
Subject: Re: locking [user] catalog tables vs 2pc vs logical rep

Hi.

The attached PG docs patch about catalog deadlocks was previously
implemented in another thread [1], but it seems more relevant to this one.

PSA.

Thank you for providing the patch.
I have updated your patch to include some other viewpoints.

For example, CLUSTER command scenario
that also causes hang of PREPARE in synchronous mode.
We get this deadlock, using the 2PC patch-set.

FYI, the scenario is
(1) create a table with a trigger
(2) create pub and sub in synchronous mode
(3) then, execute CLUSTER pg_trigger USING pg_trigger_oid_index,
and do some operations (e.g. INSERT) on the trigger-attached table and PREPARE

The mechanism of this is
walsender tries to take a lock on pg_trigger if the table has a trigger,
but, pg_trigger is already locked by the CLUSTER command, which leads to the deadlock.
Then, this scenario requires some operations on the table which has trigger
because it invokes the walsender to take the lock described above.

I also included the description about TRUNCATE on user_catalog_table
in the patch. Please have a look at this patch.

Best Regards,
Takamichi Osumi

Attachments:

deadlock_caveats_for_logical_decoding_v02.patchapplication/octet-stream; name=deadlock_caveats_for_logical_decoding_v02.patch
#24vignesh C
vignesh C
vignesh21@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#23)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Thu, Jun 3, 2021 at 9:18 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Tuesday, June 1, 2021 4:33 PM Peter Smith <smithpb2250@gmail.com>

To: Andres Freund <andres@anarazel.de>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>; Amit Kapila
<amit.kapila16@gmail.com>; Markus Wanner
<markus.wanner@enterprisedb.com>
Subject: Re: locking [user] catalog tables vs 2pc vs logical rep

Hi.

The attached PG docs patch about catalog deadlocks was previously
implemented in another thread [1], but it seems more relevant to this one.

PSA.

Thank you for providing the patch.
I have updated your patch to include some other viewpoints.

For example, CLUSTER command scenario
that also causes hang of PREPARE in synchronous mode.
We get this deadlock, using the 2PC patch-set.

FYI, the scenario is
(1) create a table with a trigger
(2) create pub and sub in synchronous mode
(3) then, execute CLUSTER pg_trigger USING pg_trigger_oid_index,
and do some operations (e.g. INSERT) on the trigger-attached table and PREPARE

The mechanism of this is
walsender tries to take a lock on pg_trigger if the table has a trigger,
but, pg_trigger is already locked by the CLUSTER command, which leads to the deadlock.
Then, this scenario requires some operations on the table which has trigger
because it invokes the walsender to take the lock described above.

I also included the description about TRUNCATE on user_catalog_table
in the patch. Please have a look at this patch.

1) I was not able to generate html docs with the attached patch:
logicaldecoding.sgml:1128: element sect1: validity error : Element
sect1 content does not follow the DTD, expecting (sect1info? , (title
, subtitle? , titleabbrev?) , (toc | lot | index | glossary |
bibliography)* , (((calloutlist | glosslist | bibliolist |
itemizedlist | orderedlist | segmentedlist | simplelist | variablelist
| caution | important | note | tip | warning | literallayout |
programlisting | programlistingco | screen | screenco | screenshot |
synopsis | cmdsynopsis | funcsynopsis | classsynopsis | fieldsynopsis
| constructorsynopsis | destructorsynopsis | methodsynopsis |
formalpara | para | simpara | address | blockquote | graphic |
graphicco | mediaobject | mediaobjectco | informalequation |
informalexample | informalfigure | informaltable | equation | example
| figure | table | msgset | procedure | sidebar | qandaset | task |
anchor | bridgehead | remark | highlights | abstract | authorblurb |
epigraph | indexterm | beginpage)+ , (refentry* | sect2* |
simplesect*)) | refentry+ | sect2+ | simplesect+) , (toc | lot | index
| glossary | bibliography)*), got (title sect2 sect2 note )
</sect1>

2) You could change hang to deadlock:
+ logical decoding of published table within the same
transaction leads to a hang.

Regards,
Vignesh

#25Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#23)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Thu, Jun 3, 2021 at 9:18 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Tuesday, June 1, 2021 4:33 PM Peter Smith <smithpb2250@gmail.com>

To: Andres Freund <andres@anarazel.de>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>; Amit Kapila
<amit.kapila16@gmail.com>; Markus Wanner
<markus.wanner@enterprisedb.com>
Subject: Re: locking [user] catalog tables vs 2pc vs logical rep

Hi.

The attached PG docs patch about catalog deadlocks was previously
implemented in another thread [1], but it seems more relevant to this one.

PSA.

Thank you for providing the patch.
I have updated your patch to include some other viewpoints.

I suggest creating a synchronous replication part of the patch for
back-branches as well.

--
With Regards,
Amit Kapila.

#26osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: Amit Kapila (#25)
5 attachment(s)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Thursday, June 3, 2021 7:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jun 3, 2021 at 9:18 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

Thank you for providing the patch.
I have updated your patch to include some other viewpoints.

I suggest creating a synchronous replication part of the patch for
back-branches as well.

You are right. Please have a look at the attached patch-set.
Needless to say, the patch for HEAD has descriptions that depend on
the 2pc patch-set.

Best Regards,
Takamichi Osumi

Attachments:

HEAD_deadlock_documentation_of_logical_decoding_v03.patchapplication/octet-stream; name=HEAD_deadlock_documentation_of_logical_decoding_v03.patch
PG13_deadlock_documentation_of_logical_decoding_v03.patchapplication/octet-stream; name=PG13_deadlock_documentation_of_logical_decoding_v03.patch
PG12_deadlock_documentation_of_logical_decoding_v03.patchapplication/octet-stream; name=PG12_deadlock_documentation_of_logical_decoding_v03.patch
PG11_deadlock_documentation_of_logical_decoding_v03.patchapplication/octet-stream; name=PG11_deadlock_documentation_of_logical_decoding_v03.patch
PG10_deadlock_documentation_of_logical_decoding_v03.patchapplication/octet-stream; name=PG10_deadlock_documentation_of_logical_decoding_v03.patch
#27osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: vignesh C (#24)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Thursday, June 3, 2021 1:09 PM vignesh C <vignesh21@gmail.com> wrote:

On Thu, Jun 3, 2021 at 9:18 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

Thank you for providing the patch.
I have updated your patch to include some other viewpoints.

I also included the description about TRUNCATE on user_catalog_table
in the patch. Please have a look at this patch.

1) I was not able to generate html docs with the attached patch:
logicaldecoding.sgml:1128: element sect1: validity error : Element...

Thank you for your review.
I fixed the patch to make it pass to generate html output.
Kindly have a look at the v03.

2) You could change hang to deadlock:
+ logical decoding of published table within the same
transaction leads to a hang.

Yes. I included your point. Thanks.

Best Regards,
Takamichi Osumi

#28vignesh C
vignesh C
vignesh21@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#26)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Mon, Jun 7, 2021 at 4:18 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Thursday, June 3, 2021 7:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jun 3, 2021 at 9:18 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

Thank you for providing the patch.
I have updated your patch to include some other viewpoints.

I suggest creating a synchronous replication part of the patch for
back-branches as well.

You are right. Please have a look at the attached patch-set.
Needless to say, the patch for HEAD has descriptions that depend on
the 2pc patch-set.

1)
+     <para>
+       The use of any command to take an ACCESS EXCLUSIVE lock on
[user] catalog tables
+       can cause the deadlock of logical decoding in synchronous
mode. This means that
+       at the transaction commit or prepared transaction, the command
hangs or the server
+       becomes to block any new connections. To avoid this, users
must refrain from such
+       operations.
+     </para>

Can we change it something like:
Logical decoding of transactions in synchronous replication mode
requires access to system tables and/or user catalog tables, hence
user should refrain from taking exclusive lock on system tables and/or
user catalog tables or refrain from executing commands like cluster
command which will take exclusive lock on system tables internally. If
not the transaction will get blocked at commit/prepare time because of
a deadlock.

2) I was not sure if we should include the examples below or the above
para is enough, we can hear from others and retain it if required:
+     <para>
+       When <command>COMMIT</command> is conducted for a transaction that has
+       issued explicit <command>LOCK</command> on
<structname>pg_class</structname>
+       with logical decoding, the deadlock occurs. Also, committing
one that runs
+       <command>CLUSTER</command> <structname>pg_class</structname> is another
+       deadlock scenario.
+     </para>
+
+     <para>
+       Similarly, executing <command>PREPARE TRANSACTION</command>
+       after <command>LOCK</command> command on
<structname>pg_class</structname> and
+       logical decoding of published table within the same
transaction leads to the deadlock.
+       Clustering <structname>pg_trigger</structname> by
<command>CLUSTER</command> command
+       brings about the deadlock as well, when published table has a
trigger and any operations
+       that will be decoded are conducted on the same table.
+     </para>
+
+     <para>
+       The deadlock can happen when users execute <command>TRUNCATE</command>
+       on user_catalog_table under the condition that output plugin
have reference to it.
      </para>

Regards,
Vignesh

#29Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#28)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Mon, Jun 7, 2021 at 9:26 AM vignesh C <vignesh21@gmail.com> wrote:

On Mon, Jun 7, 2021 at 4:18 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Thursday, June 3, 2021 7:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jun 3, 2021 at 9:18 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

Thank you for providing the patch.
I have updated your patch to include some other viewpoints.

I suggest creating a synchronous replication part of the patch for
back-branches as well.

You are right. Please have a look at the attached patch-set.
Needless to say, the patch for HEAD has descriptions that depend on
the 2pc patch-set.

1)
+     <para>
+       The use of any command to take an ACCESS EXCLUSIVE lock on
[user] catalog tables
+       can cause the deadlock of logical decoding in synchronous
mode. This means that
+       at the transaction commit or prepared transaction, the command
hangs or the server
+       becomes to block any new connections. To avoid this, users
must refrain from such
+       operations.
+     </para>

Can we change it something like:
Logical decoding of transactions in synchronous replication mode
requires access to system tables and/or user catalog tables, hence
user should refrain from taking exclusive lock on system tables and/or
user catalog tables or refrain from executing commands like cluster
command which will take exclusive lock on system tables internally. If
not the transaction will get blocked at commit/prepare time because of
a deadlock.

I think this is better than what the patch has proposed. I suggest
minor modifications to your proposed changes. Let's write the above
para as: "In synchronous replication setup, a deadlock can happen, if
the transaction has locked [user] catalog tables exclusively. This is
because logical decoding of transactions can lock catalog tables to
access them. To avoid this users must refrain from taking an exclusive
lock on [user] catalog tables. This can happen in the following ways:"

+     <para>
+       When <command>COMMIT</command> is conducted for a transaction that has
+       issued explicit <command>LOCK</command> on
<structname>pg_class</structname>
+       with logical decoding, the deadlock occurs. Also, committing
one that runs
+       <command>CLUSTER</command> <structname>pg_class</structname> is another
+       deadlock scenario.
      </para>

The above points need to be mentioned in the <itemizedlist> fashion.
See <sect2 id="continuous-archiving-caveats"> for an example. I think
the above point can be split as follows.

<listitem>
<para>
User has issued an explicit <command>LOCK</command> on
<structname>pg_class</structname> (or any other catalog table) in a
transaction. Now when we try to decode such a transaction, a deadlock
can happen.
</para>
</listitem>

Similarly, write separate points for Cluster and Truncate.

One more comment is that for HEAD, first just create a patch with
synchronous replication-related doc changes and then write a separate
patch for prepared transactions.

2) I was not sure if we should include the examples below or the above
para is enough,

It is better to give examples but let's use the format as I suggested above.

--
With Regards,
Amit Kapila.

#30Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#29)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Mon, Jun 7, 2021 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

One more comment is that for HEAD, first just create a patch with
synchronous replication-related doc changes and then write a separate
patch for prepared transactions.

I noticed that docs for "Synchronous replication support for Logical
Decoding" has been introduced by commit
49c0864d7ef5227faa24f903902db90e5c9d5d69 which goes till 9.6. So, I
think you need to create a patch for 9.6 as well unless one of the
existing patches already applies in 9.6.

--
With Regards,
Amit Kapila.

#31osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: Amit Kapila (#30)
7 attachment(s)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Monday, June 7, 2021 6:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jun 7, 2021 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com>
wrote:

One more comment is that for HEAD, first just create a patch with
synchronous replication-related doc changes and then write a separate
patch for prepared transactions.

I noticed that docs for "Synchronous replication support for Logical Decoding"
has been introduced by commit
49c0864d7ef5227faa24f903902db90e5c9d5d69 which goes till 9.6. So, I think
you need to create a patch for 9.6 as well unless one of the existing patches
already applies in 9.6.

OK. I could apply PG10's patch to 9.6.
Also, I've made a separate patch for 2PC description.

On the other hand, I need to mention that
there are some gaps to cause failures to apply patches
between supported versions.
(e.g. applying a patch for HEAD to stable PG13 fails)

To address the gaps between the versions,
I needed to conduct some manual fixes.
Therefore, please note that the content of patch
between PG12 and PG13 are almost same
like PG9.6 and PG10, but, I prepared
independent patches for HEAD and PG11,
in order to make those applied in a comfortable manner.

Kindly have a look at the updated patch-set.
They all passed the test of make html.

Best Regards,
Takamichi Osumi

Attachments:

PG12_deadlock_documentation_of_logical_decoding_v04.patchapplication/octet-stream; name=PG12_deadlock_documentation_of_logical_decoding_v04.patch
HEAD_deadlock_documentation_of_logical_decoding_v04.patchapplication/octet-stream; name=HEAD_deadlock_documentation_of_logical_decoding_v04.patch
HEAD_with_2PC_deadlock_documentation_of_logical_decoding_v04.patchapplication/octet-stream; name=HEAD_with_2PC_deadlock_documentation_of_logical_decoding_v04.patch
PG10_deadlock_documentation_of_logical_decoding_v04.patchapplication/octet-stream; name=PG10_deadlock_documentation_of_logical_decoding_v04.patch
PG11_deadlock_documentation_of_logical_decoding_v04.patchapplication/octet-stream; name=PG11_deadlock_documentation_of_logical_decoding_v04.patch
PG13_deadlock_documentation_of_logical_decoding_v04.patchapplication/octet-stream; name=PG13_deadlock_documentation_of_logical_decoding_v04.patch
PG96_deadlock_documentation_of_logical_decoding_v04.patchapplication/octet-stream; name=PG96_deadlock_documentation_of_logical_decoding_v04.patch
#32vignesh C
vignesh C
vignesh21@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#31)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, Jun 8, 2021 at 1:34 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Monday, June 7, 2021 6:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jun 7, 2021 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com>
wrote:

One more comment is that for HEAD, first just create a patch with
synchronous replication-related doc changes and then write a separate
patch for prepared transactions.

I noticed that docs for "Synchronous replication support for Logical Decoding"
has been introduced by commit
49c0864d7ef5227faa24f903902db90e5c9d5d69 which goes till 9.6. So, I think
you need to create a patch for 9.6 as well unless one of the existing patches
already applies in 9.6.

OK. I could apply PG10's patch to 9.6.
Also, I've made a separate patch for 2PC description.

On the other hand, I need to mention that
there are some gaps to cause failures to apply patches
between supported versions.
(e.g. applying a patch for HEAD to stable PG13 fails)

To address the gaps between the versions,
I needed to conduct some manual fixes.
Therefore, please note that the content of patch
between PG12 and PG13 are almost same
like PG9.6 and PG10, but, I prepared
independent patches for HEAD and PG11,
in order to make those applied in a comfortable manner.

Kindly have a look at the updated patch-set.
They all passed the test of make html.

Thanks for the updated patch.

I have few comments:
1) Should we list the actual system tables like pg_class,pg_trigger,
etc instead of any other catalog table?
User has issued an explicit LOCK on pg_class (or any other catalog table)
2) Here This means deadlock, after this we mention deadlock again for
each of the examples, we can remove it if redundant.
This can happen in the following ways:
3) Should [user] catalog tables be catalog tables or user catalog tables
[user] catalog tables

Regards,
Vignesh

#33Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#32)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Tue, Jun 8, 2021 at 6:24 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the updated patch.

I have few comments:
1) Should we list the actual system tables like pg_class,pg_trigger,
etc instead of any other catalog table?
User has issued an explicit LOCK on pg_class (or any other catalog table)

I think the way it is mentioned is okay. We don't need to specify
other catalog tables.

2) Here This means deadlock, after this we mention deadlock again for
each of the examples, we can remove it if redundant.
This can happen in the following ways:
3) Should [user] catalog tables be catalog tables or user catalog tables
[user] catalog tables

The third point is not clear. Can you please elaborate by quoting the
exact change from the patch?

--
With Regards,
Amit Kapila.

#34osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: Amit Kapila (#33)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Wednesday, June 9, 2021 12:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jun 8, 2021 at 6:24 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the updated patch.

I have few comments:
1) Should we list the actual system tables like pg_class,pg_trigger,
etc instead of any other catalog table?
User has issued an explicit LOCK on pg_class (or any other catalog
table)

I think the way it is mentioned is okay. We don't need to specify other catalog
tables.

Okay.

2) Here This means deadlock, after this we mention deadlock again for
each of the examples, we can remove it if redundant.
This can happen in the following ways:

I think this sentence works to notify that commands described below
are major scenarios naturally, to the readers. Then, I don't want to remove it.

If you somehow feel that the descriptions are redundant,
how about unifying all listitems as nouns. like below ?

* An explicit <command>LOCK</command> on <structname>pg_class</structname> (or any other catalog table) in a transaction
* Reordering <structname>pg_class</structname> by <command>CLUSTER</command> command in a transaction
* Executing <command>TRUNCATE</command> on user_catalog_table

3) Should [user] catalog tables be catalog tables or user catalog
tables [user] catalog tables

The third point is not clear. Can you please elaborate by quoting the exact
change from the patch?

IIUC, he means to replace all descriptions "[user] catalog tables"
with "catalog tables or user catalog tables" in the patch,
because seemingly we don't use square brackets to describe optional clause in
normal descriptions(like outside of synopsis and I don't find any example for this).
But, even if so, I would like to keep the current square brackets description,
which makes sentence short and simple.

Best Regards,
Takamichi Osumi

#35Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#34)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Wed, Jun 9, 2021 at 12:03 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Wednesday, June 9, 2021 12:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jun 8, 2021 at 6:24 PM vignesh C <vignesh21@gmail.com> wrote:

3) Should [user] catalog tables be catalog tables or user catalog
tables [user] catalog tables

The third point is not clear. Can you please elaborate by quoting the exact
change from the patch?

IIUC, he means to replace all descriptions "[user] catalog tables"
with "catalog tables or user catalog tables" in the patch,
because seemingly we don't use square brackets to describe optional clause in
normal descriptions(like outside of synopsis and I don't find any example for this).
But, even if so, I would like to keep the current square brackets description,
which makes sentence short and simple.

+1.

--
With Regards,
Amit Kapila.

#36osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: osumi.takamichi@fujitsu.com (#31)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Tuesday, June 8, 2021 5:04 PM I wrote:

On Monday, June 7, 2021 6:22 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Mon, Jun 7, 2021 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com>
wrote:

One more comment is that for HEAD, first just create a patch with
synchronous replication-related doc changes and then write a
separate patch for prepared transactions.

I noticed that docs for "Synchronous replication support for Logical Decoding"
has been introduced by commit
49c0864d7ef5227faa24f903902db90e5c9d5d69 which goes till 9.6. So, I
think you need to create a patch for 9.6 as well unless one of the
existing patches already applies in 9.6.

OK. I could apply PG10's patch to 9.6.
Also, I've made a separate patch for 2PC description.

On the other hand, I need to mention that there are some gaps to cause failures
to apply patches between supported versions.
(e.g. applying a patch for HEAD to stable PG13 fails)

I scrutinized this POV and checked the gaps between supported versions.
In terms of the section where the patch want to fix,
there are only 2 major gaps between PG10 and PG11 - [1]how we finish xref tag is different between PG10 and PG11
and between PG13 and HEAD - [2]in HEAD, we have a new sect1 after "Synchronous Replication Support for Logical Decoding". In other words,
the patch-set should be 4 types.

* patch for HEAD
* additional patch for HEAD based on 2PC patch-set
* patch for from PG11 to PG13
* patch for PG9.6 and PG10

To address the gaps between the versions, I needed to conduct some manual
fixes.
Therefore, please note that the content of patch between PG12 and PG13 are
almost same like PG9.6 and PG10, but, I prepared independent patches for
HEAD and PG11, in order to make those applied in a comfortable manner.

Therefore, I was wrong.
I didn't need the specific independent patch for PG11.
I'll fix the patch-set accordingly in the next version.

[1]: how we finish xref tag is different between PG10 and PG11

--- logicaldecoding.sgml_PG11   2021-06-09 04:38:18.214163527 +0000
+++ logicaldecoding.sgml_PG10   2021-06-09 04:37:50.533163527 +0000
@@ -730,9 +698,9 @@
     replication</link> solutions with the same user interface as synchronous
     replication for <link linkend="streaming-replication">streaming
     replication</link>.  To do this, the streaming replication interface
-    (see <xref linkend="logicaldecoding-walsender"/>) must be used to stream out
+    (see <xref linkend="logicaldecoding-walsender">) must be used to stream out
     data. Clients have to send <literal>Standby status update (F)</literal>
-    (see <xref linkend="protocol-replication"/>) messages, just like streaming
+    (see <xref linkend="protocol-replication">) messages, just like streaming
     replication clients do.
    </para>

[2]: in HEAD, we have a new sect1 after "Synchronous Replication Support for Logical Decoding"

--- logicaldecoding.sgml_PG13   2021-06-09 05:10:34.927163527 +0000
+++ logicaldecoding.sgml_HEAD   2021-06-09 05:08:12.810163527 +0000
@@ -747,4 +1089,177 @@
      </para>
    </note>
   </sect1>
+
+  <sect1 id="logicaldecoding-streaming">
+   <title>Streaming of Large Transactions for Logical Decoding</title>
+
+   <para>
+    The basic output plugin callbacks (e.g., <function>begin_cb</function>,
...

Best Regards,
Takamichi Osumi

#37vignesh C
vignesh C
vignesh21@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#34)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Wed, Jun 9, 2021 at 12:03 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Wednesday, June 9, 2021 12:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jun 8, 2021 at 6:24 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the updated patch.

I have few comments:
1) Should we list the actual system tables like pg_class,pg_trigger,
etc instead of any other catalog table?
User has issued an explicit LOCK on pg_class (or any other catalog
table)

I think the way it is mentioned is okay. We don't need to specify other catalog
tables.

Okay.

2) Here This means deadlock, after this we mention deadlock again for
each of the examples, we can remove it if redundant.
This can happen in the following ways:

I think this sentence works to notify that commands described below
are major scenarios naturally, to the readers. Then, I don't want to remove it.

If you somehow feel that the descriptions are redundant,
how about unifying all listitems as nouns. like below ?

* An explicit <command>LOCK</command> on <structname>pg_class</structname> (or any other catalog table) in a transaction
* Reordering <structname>pg_class</structname> by <command>CLUSTER</command> command in a transaction
* Executing <command>TRUNCATE</command> on user_catalog_table

This looks good to me. Keep the 2PC documentation patch also on the same lines.

Regards,
Vignesh

#38osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: vignesh C (#37)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Thursday, June 10, 2021 1:14 PM vignesh C <vignesh21@gmail.com>

On Wed, Jun 9, 2021 at 12:03 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Wednesday, June 9, 2021 12:06 PM Amit Kapila

<amit.kapila16@gmail.com> wrote:

On Tue, Jun 8, 2021 at 6:24 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the updated patch.

I have few comments:
1) Should we list the actual system tables like
pg_class,pg_trigger, etc instead of any other catalog table?
User has issued an explicit LOCK on pg_class (or any other catalog
table)

I think the way it is mentioned is okay. We don't need to specify
other catalog tables.

Okay.

2) Here This means deadlock, after this we mention deadlock again
for each of the examples, we can remove it if redundant.
This can happen in the following ways:

I think this sentence works to notify that commands described below
are major scenarios naturally, to the readers. Then, I don't want to remove

it.

If you somehow feel that the descriptions are redundant, how about
unifying all listitems as nouns. like below ?

* An explicit <command>LOCK</command> on
<structname>pg_class</structname> (or any other catalog table) in a
transaction
* Reordering <structname>pg_class</structname> by
<command>CLUSTER</command> command in a transaction
* Executing <command>TRUNCATE</command> on user_catalog_table

This looks good to me. Keep the 2PC documentation patch also on the same
lines.

Yeah, of course. Thanks for your confirmation.

Best Regards,
Takamichi Osumi

#39osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: osumi.takamichi@fujitsu.com (#38)
7 attachment(s)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Thursday, June 10, 2021 1:30 PM I wrote:

On Thursday, June 10, 2021 1:14 PM vignesh C <vignesh21@gmail.com>

On Wed, Jun 9, 2021 at 12:03 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Wednesday, June 9, 2021 12:06 PM Amit Kapila

<amit.kapila16@gmail.com> wrote:

On Tue, Jun 8, 2021 at 6:24 PM vignesh C <vignesh21@gmail.com>

wrote:

Thanks for the updated patch.

I have few comments:
1) Should we list the actual system tables like
pg_class,pg_trigger, etc instead of any other catalog table?
User has issued an explicit LOCK on pg_class (or any other
catalog
table)

I think the way it is mentioned is okay. We don't need to specify
other catalog tables.

Okay.

2) Here This means deadlock, after this we mention deadlock
again for each of the examples, we can remove it if redundant.
This can happen in the following ways:

I think this sentence works to notify that commands described below
are major scenarios naturally, to the readers. Then, I don't want to
remove

it.

If you somehow feel that the descriptions are redundant, how about
unifying all listitems as nouns. like below ?

* An explicit <command>LOCK</command> on
<structname>pg_class</structname> (or any other catalog table) in a
transaction
* Reordering <structname>pg_class</structname> by
<command>CLUSTER</command> command in a transaction
* Executing <command>TRUNCATE</command> on

user_catalog_table

This looks good to me. Keep the 2PC documentation patch also on the
same lines.

Yeah, of course. Thanks for your confirmation.

Hi, attached the updated patch-set.

I've conducted some updates.

(1) Added commit messages for all patches
(2) Sorted out the descriptions of listitem to make them look uniform
(3) Removed PG11-specific patch and unified the patch from PG11 to PG13,
which will keep the documents cleanliness for future back-patching, if any.

(4) Removed unnecessary space after 'id'

In v04, there was an unneeded space like below. Fixed.
In the same logicaldecoding.sgml doc, there is no space after 'id' for sec2.

+   <sect2 id ="logicaldecoding-synchronous-caveats">
+    <title>Caveats</title>

(5) Fixed the reference accurately by replacing link tag with xref tag.

In v04, I let the reference be inaccurate, because the linkend points to the caveats
but the link word was "Synchronous Replication Support for Logical Decoding".

+       [user] catalog tables exclusively. To avoid this users must refrain from
+       having locks on catalog tables (e.g. explicit <command>LOCK</command> command)
+       in such transactions.
+       (See <link linkend="logicaldecoding-synchronous-caveats">Synchronous
+       Replication Support for Logical Decoding</link> for the details.)

So, in v05, I've fixed this to point out the caveats directly.

+       [user] catalog tables exclusively. To avoid this users must refrain from
+       having locks on catalog tables (e.g. explicit <command>LOCK</command> command)
+       in such transactions.
+       (See <xref linkend="logicaldecoding-synchronous-caveats"/> for the details.)

Kindly have a look at the patch-set.

Best Regards,
Takamichi Osumi

Attachments:

HEAD_deadlock_documentation_of_logical_decoding_v05.patchapplication/octet-stream; name=HEAD_deadlock_documentation_of_logical_decoding_v05.patch
HEAD_with_2PC_deadlock_documentation_of_logical_decoding_v05.patchapplication/octet-stream; name=HEAD_with_2PC_deadlock_documentation_of_logical_decoding_v05.patch
PG10_deadlock_documentation_of_logical_decoding_v05.patchapplication/octet-stream; name=PG10_deadlock_documentation_of_logical_decoding_v05.patch
PG11_deadlock_documentation_of_logical_decoding_v05.patchapplication/octet-stream; name=PG11_deadlock_documentation_of_logical_decoding_v05.patch
PG12_deadlock_documentation_of_logical_decoding_v05.patchapplication/octet-stream; name=PG12_deadlock_documentation_of_logical_decoding_v05.patch
PG13_deadlock_documentation_of_logical_decoding_v05.patchapplication/octet-stream; name=PG13_deadlock_documentation_of_logical_decoding_v05.patch
PG96_deadlock_documentation_of_logical_decoding_v05.patchapplication/octet-stream; name=PG96_deadlock_documentation_of_logical_decoding_v05.patch
#40vignesh C
vignesh C
vignesh21@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#39)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Fri, Jun 11, 2021 at 6:57 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Thursday, June 10, 2021 1:30 PM I wrote:

On Thursday, June 10, 2021 1:14 PM vignesh C <vignesh21@gmail.com>

On Wed, Jun 9, 2021 at 12:03 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Wednesday, June 9, 2021 12:06 PM Amit Kapila

<amit.kapila16@gmail.com> wrote:

On Tue, Jun 8, 2021 at 6:24 PM vignesh C <vignesh21@gmail.com>

wrote:

Thanks for the updated patch.

I have few comments:
1) Should we list the actual system tables like
pg_class,pg_trigger, etc instead of any other catalog table?
User has issued an explicit LOCK on pg_class (or any other
catalog
table)

I think the way it is mentioned is okay. We don't need to specify
other catalog tables.

Okay.

2) Here This means deadlock, after this we mention deadlock
again for each of the examples, we can remove it if redundant.
This can happen in the following ways:

I think this sentence works to notify that commands described below
are major scenarios naturally, to the readers. Then, I don't want to
remove

it.

If you somehow feel that the descriptions are redundant, how about
unifying all listitems as nouns. like below ?

* An explicit <command>LOCK</command> on
<structname>pg_class</structname> (or any other catalog table) in a
transaction
* Reordering <structname>pg_class</structname> by
<command>CLUSTER</command> command in a transaction
* Executing <command>TRUNCATE</command> on

user_catalog_table

This looks good to me. Keep the 2PC documentation patch also on the
same lines.

Yeah, of course. Thanks for your confirmation.

Hi, attached the updated patch-set.

I've conducted some updates.

(1) Added commit messages for all patches
(2) Sorted out the descriptions of listitem to make them look uniform
(3) Removed PG11-specific patch and unified the patch from PG11 to PG13,
which will keep the documents cleanliness for future back-patching, if any.

(4) Removed unnecessary space after 'id'

In v04, there was an unneeded space like below. Fixed.
In the same logicaldecoding.sgml doc, there is no space after 'id' for sec2.

+   <sect2 id ="logicaldecoding-synchronous-caveats">
+    <title>Caveats</title>

(5) Fixed the reference accurately by replacing link tag with xref tag.

In v04, I let the reference be inaccurate, because the linkend points to the caveats
but the link word was "Synchronous Replication Support for Logical Decoding".

+       [user] catalog tables exclusively. To avoid this users must refrain from
+       having locks on catalog tables (e.g. explicit <command>LOCK</command> command)
+       in such transactions.
+       (See <link linkend="logicaldecoding-synchronous-caveats">Synchronous
+       Replication Support for Logical Decoding</link> for the details.)

So, in v05, I've fixed this to point out the caveats directly.

+       [user] catalog tables exclusively. To avoid this users must refrain from
+       having locks on catalog tables (e.g. explicit <command>LOCK</command> command)
+       in such transactions.
+       (See <xref linkend="logicaldecoding-synchronous-caveats"/> for the details.)

Kindly have a look at the patch-set.

Thanks for the updated patch:
Few comments:
1) We have used Reordering and Clustering for the same command, we
could rephrase similarly in both places.
+       <para>
+        Reordering <structname>pg_class</structname> by
<command>CLUSTER</command>
+        command in a transaction.
+       </para>
+       <para>
+        Clustering <structname>pg_trigger</structname> and decoding
<command>PREPARE
+        TRANSACTION</command>, if any published table have a trigger and any
+        operations that will be decoded are conducted.
+       </para>
+      </listitem>
2) Here user_catalog_table should be user catalog table
+       <para>
+        Executing <command>TRUNCATE</command> on user_catalog_table
in a transaction.
+       </para>

Regards,
Vignesh

#41osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: vignesh C (#40)
7 attachment(s)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Friday, June 11, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the updated patch:
Few comments:
1) We have used Reordering and Clustering for the same command, we could
rephrase similarly in both places.
+       <para>
+        Reordering <structname>pg_class</structname> by
<command>CLUSTER</command>
+        command in a transaction.
+       </para>
+       <para>
+        Clustering <structname>pg_trigger</structname> and decoding
<command>PREPARE
+        TRANSACTION</command>, if any published table have a trigger
and any
+        operations that will be decoded are conducted.
+       </para>
+      </listitem>
2) Here user_catalog_table should be user catalog table
+       <para>
+        Executing <command>TRUNCATE</command> on
user_catalog_table
in a transaction.
+       </para>

Thanks for your review.

Attached the patch-set that addressed those two comments.
I also fixed the commit message a bit in the 2PC specific patch to HEAD.
No other changes.

Please check.

Best Regards,
Takamichi Osumi

Attachments:

PG10_deadlock_documentation_of_logical_decoding_v06.patchapplication/octet-stream; name=PG10_deadlock_documentation_of_logical_decoding_v06.patch
HEAD_deadlock_documentation_of_logical_decoding_v06.patchapplication/octet-stream; name=HEAD_deadlock_documentation_of_logical_decoding_v06.patch
HEAD_with_2PC_deadlock_documentation_of_logical_decoding_v06.patchapplication/octet-stream; name=HEAD_with_2PC_deadlock_documentation_of_logical_decoding_v06.patch
PG11_deadlock_documentation_of_logical_decoding_v06.patchapplication/octet-stream; name=PG11_deadlock_documentation_of_logical_decoding_v06.patch
PG12_deadlock_documentation_of_logical_decoding_v06.patchapplication/octet-stream; name=PG12_deadlock_documentation_of_logical_decoding_v06.patch
PG13_deadlock_documentation_of_logical_decoding_v06.patchapplication/octet-stream; name=PG13_deadlock_documentation_of_logical_decoding_v06.patch
PG96_deadlock_documentation_of_logical_decoding_v06.patchapplication/octet-stream; name=PG96_deadlock_documentation_of_logical_decoding_v06.patch
#42vignesh C
vignesh C
vignesh21@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#41)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Mon, Jun 14, 2021 at 5:33 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Friday, June 11, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the updated patch:
Few comments:
1) We have used Reordering and Clustering for the same command, we could
rephrase similarly in both places.
+       <para>
+        Reordering <structname>pg_class</structname> by
<command>CLUSTER</command>
+        command in a transaction.
+       </para>
+       <para>
+        Clustering <structname>pg_trigger</structname> and decoding
<command>PREPARE
+        TRANSACTION</command>, if any published table have a trigger
and any
+        operations that will be decoded are conducted.
+       </para>
+      </listitem>
2) Here user_catalog_table should be user catalog table
+       <para>
+        Executing <command>TRUNCATE</command> on
user_catalog_table
in a transaction.
+       </para>

Thanks for your review.

Attached the patch-set that addressed those two comments.
I also fixed the commit message a bit in the 2PC specific patch to HEAD.
No other changes.

Please check.

Thanks for the updated patches, the patch applies cleanly in all branches.
Please add a commitfest entry for this, so that we don't miss it.

Regards,
Vignesh

#43osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: vignesh C (#42)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Tuesday, June 15, 2021 1:51 PM vignesh C <vignesh21@gmail.com> wrote:

Attached the patch-set that addressed those two comments.
I also fixed the commit message a bit in the 2PC specific patch to HEAD.
No other changes.

Please check.

Thanks for the updated patches, the patch applies cleanly in all branches.
Please add a commitfest entry for this, so that we don't miss it.

Thank you. I've registered the patch-set in [1]https://commitfest.postgresql.org/33/3170/.
I'll wait for other reviews from other developers, if any.

[1]: https://commitfest.postgresql.org/33/3170/

Best Regards,
Takamichi Osumi

#44Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#41)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Mon, Jun 14, 2021 at 5:33 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Friday, June 11, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote:

Attached the patch-set that addressed those two comments.

Few minor comments:
1.
+      <listitem>
+       <para>
+        Clustering <structname>pg_class</structname> in a transaction.

Can we change above to: Perform <command>CLUSTER</command> on
<structname>pg_class</structname> in a transaction.

2.
+      <listitem>
+       <para>
+        Executing <command>TRUNCATE</command> on user catalog table
in a transaction.
+       </para>

Square brackets are missing for user.

3.
+    <indexterm>
+     <primary>Overview</primary>
+    </indexterm>
..
..
+    <indexterm>
+     <primary>Caveats</primary>
+    </indexterm>

Why are these required when we already have titles? I have seen other
places in the docs where we use titles for Overview and Caveats but
they didn't have similar usage.

4.
<para>
+        Performing <command>PREPARE TRANSACTION</command> after
<command>LOCK</command>
+        command on <structname>pg_class</structname> and logical
decoding of published
+        table.

Can we change above to: <command>PREPARE TRANSACTION</command> after
<command>LOCK</command> command on <structname>pg_class</structname>
and allow logical decoding of two-phase transactions.

5.
+       <para>
+        Clustering <structname>pg_trigger</structname> and decoding
<command>PREPARE
+        TRANSACTION</command>, if any published table have a trigger and any
+        operations that will be decoded are conducted.
+       </para>

Can we change above to: <command>PREPARE TRANSACTION</command> after
<command>CLUSTER</command> command on
<structname>pg_trigger</structname> and allow logical decoding of
two-phase transactions. This will lead to deadlock only when published
table have a trigger.

--
With Regards,
Amit Kapila.

#45vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#44)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Wed, Jun 16, 2021 at 3:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jun 14, 2021 at 5:33 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Friday, June 11, 2021 2:13 PM vignesh C <vignesh21@gmail.com> wrote:

Attached the patch-set that addressed those two comments.

Few minor comments:
1.
+      <listitem>
+       <para>
+        Clustering <structname>pg_class</structname> in a transaction.

Can we change above to: Perform <command>CLUSTER</command> on
<structname>pg_class</structname> in a transaction.

2.
+      <listitem>
+       <para>
+        Executing <command>TRUNCATE</command> on user catalog table
in a transaction.
+       </para>

Square brackets are missing for user.

3.
+    <indexterm>
+     <primary>Overview</primary>
+    </indexterm>
..
..
+    <indexterm>
+     <primary>Caveats</primary>
+    </indexterm>

Why are these required when we already have titles? I have seen other
places in the docs where we use titles for Overview and Caveats but
they didn't have similar usage.

Even I felt this was not required. I had checked other places and also
prepared doc by removing it, it works fine.

Regards,
Vignesh

#46osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: Amit Kapila (#44)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Wednesday, June 16, 2021 7:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jun 14, 2021 at 5:33 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Friday, June 11, 2021 2:13 PM vignesh C <vignesh21@gmail.com>

wrote:

Attached the patch-set that addressed those two comments.

Few minor comments:
1.
+      <listitem>
+       <para>
+        Clustering <structname>pg_class</structname> in a transaction.

Can we change above to: Perform <command>CLUSTER</command> on
<structname>pg_class</structname> in a transaction.

Looks better.

2.
+      <listitem>
+       <para>
+        Executing <command>TRUNCATE</command> on user catalog
table
in a transaction.
+       </para>

Square brackets are missing for user.

Thanks for catching it. You are right.

3.
+    <indexterm>
+     <primary>Overview</primary>
+    </indexterm>
..
..
+    <indexterm>
+     <primary>Caveats</primary>
+    </indexterm>

Why are these required when we already have titles? I have seen other places
in the docs where we use titles for Overview and Caveats but they didn't have
similar usage.

Sorry, this was a mistake. We didn't need those sections.

4.
<para>
+        Performing <command>PREPARE TRANSACTION</command>
after
<command>LOCK</command>
+        command on <structname>pg_class</structname> and logical
decoding of published
+        table.

Can we change above to: <command>PREPARE
TRANSACTION</command> after <command>LOCK</command>
command on <structname>pg_class</structname> and allow logical
decoding of two-phase transactions.

5.
+       <para>
+        Clustering <structname>pg_trigger</structname> and decoding
<command>PREPARE
+        TRANSACTION</command>, if any published table have a trigger
and any
+        operations that will be decoded are conducted.
+       </para>

Can we change above to: <command>PREPARE
TRANSACTION</command> after <command>CLUSTER</command>
command on <structname>pg_trigger</structname> and allow logical
decoding of two-phase transactions. This will lead to deadlock only when
published table have a trigger.

Yeah, I needed the nuance to turn on logical decoding of two-phase transactions...
Your above suggestions are much tidier and more accurate than mine.
I agree with your all suggestions.

Best Regards,
Takamichi Osumi

#47Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#46)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Thu, Jun 17, 2021 at 8:41 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Wednesday, June 16, 2021 7:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jun 14, 2021 at 5:33 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Friday, June 11, 2021 2:13 PM vignesh C <vignesh21@gmail.com>

wrote:

Attached the patch-set that addressed those two comments.

Few minor comments:
1.
+      <listitem>
+       <para>
+        Clustering <structname>pg_class</structname> in a transaction.

Can we change above to: Perform <command>CLUSTER</command> on
<structname>pg_class</structname> in a transaction.

Looks better.

2.
+      <listitem>
+       <para>
+        Executing <command>TRUNCATE</command> on user catalog
table
in a transaction.
+       </para>

Square brackets are missing for user.

Thanks for catching it. You are right.

3.
+    <indexterm>
+     <primary>Overview</primary>
+    </indexterm>
..
..
+    <indexterm>
+     <primary>Caveats</primary>
+    </indexterm>

Why are these required when we already have titles? I have seen other places
in the docs where we use titles for Overview and Caveats but they didn't have
similar usage.

Sorry, this was a mistake. We didn't need those sections.

4.
<para>
+        Performing <command>PREPARE TRANSACTION</command>
after
<command>LOCK</command>
+        command on <structname>pg_class</structname> and logical
decoding of published
+        table.

Can we change above to: <command>PREPARE
TRANSACTION</command> after <command>LOCK</command>
command on <structname>pg_class</structname> and allow logical
decoding of two-phase transactions.

5.
+       <para>
+        Clustering <structname>pg_trigger</structname> and decoding
<command>PREPARE
+        TRANSACTION</command>, if any published table have a trigger
and any
+        operations that will be decoded are conducted.
+       </para>

Can we change above to: <command>PREPARE
TRANSACTION</command> after <command>CLUSTER</command>
command on <structname>pg_trigger</structname> and allow logical
decoding of two-phase transactions. This will lead to deadlock only when
published table have a trigger.

Yeah, I needed the nuance to turn on logical decoding of two-phase transactions...
Your above suggestions are much tidier and more accurate than mine.
I agree with your all suggestions.

Pushed!

--
With Regards,
Amit Kapila.

#48Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#47)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Thu, Jun 17, 2021 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jun 17, 2021 at 8:41 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

Pushed!

[Responding to Simon's comments]

If LOCK and TRUNCATE is advised against on all user catalog tables, why would CLUSTER only apply to pg_class? Surely its locking
level is the same as LOCK?

Cluster will also apply to all user catalog tables. I think we can
extend it slightly as we have mentioned for Lock.

The use of "[user]" isn't fully explained, so it might not be clear that this applies to both Postgres catalog tables and any user tables
that have been nominated as catalogs. Probably worth linking to the "Capabilities" section to explain.

Sounds reasonable.

It would be worth coalescing the following sections into a single page, since they are just a few lines each:
Streaming Replication Protocol Interface
Logical Decoding SQL Interface
System Catalogs Related to Logical Decoding

I think this is worth considering but we might want to discuss this as
a separate change/patch.

--
With Regards,
Amit Kapila.

#49Simon Riggs
Simon Riggs
simon.riggs@enterprisedb.com
In reply to: Amit Kapila (#48)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Thu, Jun 17, 2021 at 12:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jun 17, 2021 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jun 17, 2021 at 8:41 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

Pushed!

[Responding to Simon's comments]

If LOCK and TRUNCATE is advised against on all user catalog tables, why would CLUSTER only apply to pg_class? Surely its locking
level is the same as LOCK?

Cluster will also apply to all user catalog tables. I think we can
extend it slightly as we have mentioned for Lock.

OK, good.

The use of "[user]" isn't fully explained, so it might not be clear that this applies to both Postgres catalog tables and any user tables
that have been nominated as catalogs. Probably worth linking to the "Capabilities" section to explain.

Sounds reasonable.

It would be worth coalescing the following sections into a single page, since they are just a few lines each:
Streaming Replication Protocol Interface
Logical Decoding SQL Interface
System Catalogs Related to Logical Decoding

I think this is worth considering but we might want to discuss this as
a separate change/patch.

Makes sense.

Thanks

--
Simon Riggs http://www.EnterpriseDB.com/

#50osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: Simon Riggs (#49)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Thursday, June 17, 2021 10:34 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:

On Thu, Jun 17, 2021 at 12:57 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Thu, Jun 17, 2021 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com>

wrote:

On Thu, Jun 17, 2021 at 8:41 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

Pushed!

[Responding to Simon's comments]

If LOCK and TRUNCATE is advised against on all user catalog tables,
why would CLUSTER only apply to pg_class? Surely its locking level is the

same as LOCK?

Cluster will also apply to all user catalog tables. I think we can
extend it slightly as we have mentioned for Lock.

OK, good.

The use of "[user]" isn't fully explained, so it might not be clear
that this applies to both Postgres catalog tables and any user tables that

have been nominated as catalogs. Probably worth linking to the "Capabilities"
section to explain.

Sounds reasonable.

Simon, I appreciate your suggestions and yes,
if the user catalog table is referenced by the output plugin,
it can be another cause of the deadlock.

I'm going to post the patch for the those two changes, accordingly.

Best Regards,
Takamichi Osumi

#51osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: osumi.takamichi@fujitsu.com (#50)
6 attachment(s)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Friday, June 18, 2021 11:41 AM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote:

On Thursday, June 17, 2021 10:34 PM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:

On Thu, Jun 17, 2021 at 12:57 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Thu, Jun 17, 2021 at 4:27 PM Amit Kapila
<amit.kapila16@gmail.com>

wrote:

On Thu, Jun 17, 2021 at 8:41 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

Pushed!

[Responding to Simon's comments]

If LOCK and TRUNCATE is advised against on all user catalog
tables, why would CLUSTER only apply to pg_class? Surely its
locking level is the

same as LOCK?

Cluster will also apply to all user catalog tables. I think we can
extend it slightly as we have mentioned for Lock.

OK, good.

The use of "[user]" isn't fully explained, so it might not be
clear that this applies to both Postgres catalog tables and any
user tables that

have been nominated as catalogs. Probably worth linking to the

"Capabilities"

section to explain.

Sounds reasonable.

Simon, I appreciate your suggestions and yes, if the user catalog table is
referenced by the output plugin, it can be another cause of the deadlock.

I'm going to post the patch for the those two changes, accordingly.

Hi, I've made the patch-set to cover the discussion above for all-supported versions.
Please have a look at those.

Best Regards,
Takamichi Osumi

Attachments:

HEAD_extend_logical_decoding_caveats_in_sync_mode_v01.patchapplication/octet-stream; name=HEAD_extend_logical_decoding_caveats_in_sync_mode_v01.patch
PG10_extend_logical_decoding_caveats_in_sync_mode_v01.patchapplication/octet-stream; name=PG10_extend_logical_decoding_caveats_in_sync_mode_v01.patch
PG11_extend_logical_decoding_caveats_in_sync_mode_v01.patchapplication/octet-stream; name=PG11_extend_logical_decoding_caveats_in_sync_mode_v01.patch
PG12_extend_logical_decoding_caveats_in_sync_mode_v01.patchapplication/octet-stream; name=PG12_extend_logical_decoding_caveats_in_sync_mode_v01.patch
PG13_extend_logical_decoding_caveats_in_sync_mode_v01.patchapplication/octet-stream; name=PG13_extend_logical_decoding_caveats_in_sync_mode_v01.patch
PG96_extend_logical_decoding_caveats_in_sync_mode_v01.patchapplication/octet-stream; name=PG96_extend_logical_decoding_caveats_in_sync_mode_v01.patch
#52Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#51)
1 attachment(s)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Fri, Jun 18, 2021 at 2:25 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Friday, June 18, 2021 11:41 AM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote:

Simon, I appreciate your suggestions and yes, if the user catalog table is
referenced by the output plugin, it can be another cause of the deadlock.

I'm going to post the patch for the those two changes, accordingly.

Hi, I've made the patch-set to cover the discussion above for all-supported versions.
Please have a look at those.

I have slightly modified your patch, see if the attached looks okay to
you? This is just a HEAD patch, I'll modify the back-branch patches
accordingly.

--
With Regards,
Amit Kapila.

Attachments:

v2-0001-Doc-Update-caveats-in-synchronous-logical-replica.patchapplication/octet-stream; name=v2-0001-Doc-Update-caveats-in-synchronous-logical-replica.patch
#53osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: Amit Kapila (#52)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Saturday, June 19, 2021 6:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jun 18, 2021 at 2:25 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Friday, June 18, 2021 11:41 AM osumi.takamichi@fujitsu.com

<osumi.takamichi@fujitsu.com> wrote:

Simon, I appreciate your suggestions and yes, if the user catalog
table is referenced by the output plugin, it can be another cause of the

deadlock.

I'm going to post the patch for the those two changes, accordingly.

Hi, I've made the patch-set to cover the discussion above for all-supported

versions.

Please have a look at those.

I have slightly modified your patch, see if the attached looks okay to you? This
is just a HEAD patch, I'll modify the back-branch patches accordingly.

Thank you for updating the patch.
The patch becomes much better. Yet, I have one suggestion.

* doc/src/sgml/logicaldecoding.sgml
      <itemizedlist>
       <listitem>
        <para>
         Issuing an explicit <command>LOCK</command> on <structname>pg_class</structname>
-        (or any other catalog table) in a transaction.
+        (or any other [user] catalog table) in a transaction.
        </para>
       </listitem>
       <listitem>
        <para>
-        Perform <command>CLUSTER</command> on <structname>pg_class</structname> in
-        a transaction.
+        Perform <command>CLUSTER</command> on <structname>pg_class</structname> (or any
+        other [user] catalog table) in a transaction.
        </para>
       </listitem>
       <listitem>
        <para>
         <command>PREPARE TRANSACTION</command> after <command>LOCK</command> command
-        on <structname>pg_class</structname> and allow logical decoding of two-phase
-        transactions.
+        on <structname>pg_class</structname> (or any other [user] catalog table) and
+        allow logical decoding of two-phase transactions.
        </para>
       </listitem>
       <listitem>
        <para>
         <command>PREPARE TRANSACTION</command> after <command>CLUSTER</command>
-        command on <structname>pg_trigger</structname> and allow logical decoding of
-        two-phase transactions. This will lead to deadlock only when published table
-        have a trigger.
+        command on <structname>pg_trigger</structname> (or any other [user] catalog
+        table) and allow logical decoding of two-phase transactions. This will lead
+        to deadlock only when published table have a trigger.

Now we have the four paren supplementary descriptions,
not to make users miss any other [user] catalog tables.
Because of this, the built output html gives me some redundant
impression, for that parts. Accordingly, couldn't we move them
to outside of the itemizedlist section in a simple manner ?

For example, to insert a sentence below the list,
after removing the paren descriptions in the listitem, which says
"Note that those commands that can cause deadlock apply to not only
explicitly indicated system catalog tables above but also any other [user] catalog table."

Best Regards,
Takamichi Osumi

#54Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#53)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Sun, Jun 20, 2021 at 9:28 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Saturday, June 19, 2021 6:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jun 18, 2021 at 2:25 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Friday, June 18, 2021 11:41 AM osumi.takamichi@fujitsu.com

<osumi.takamichi@fujitsu.com> wrote:

Simon, I appreciate your suggestions and yes, if the user catalog
table is referenced by the output plugin, it can be another cause of the

deadlock.

I'm going to post the patch for the those two changes, accordingly.

Hi, I've made the patch-set to cover the discussion above for all-supported

versions.

Please have a look at those.

I have slightly modified your patch, see if the attached looks okay to you? This
is just a HEAD patch, I'll modify the back-branch patches accordingly.

Thank you for updating the patch.
The patch becomes much better. Yet, I have one suggestion.

* doc/src/sgml/logicaldecoding.sgml
<itemizedlist>
<listitem>
<para>
Issuing an explicit <command>LOCK</command> on <structname>pg_class</structname>
-        (or any other catalog table) in a transaction.
+        (or any other [user] catalog table) in a transaction.
</para>
</listitem>
<listitem>
<para>
-        Perform <command>CLUSTER</command> on <structname>pg_class</structname> in
-        a transaction.
+        Perform <command>CLUSTER</command> on <structname>pg_class</structname> (or any
+        other [user] catalog table) in a transaction.
</para>
</listitem>
<listitem>
<para>
<command>PREPARE TRANSACTION</command> after <command>LOCK</command> command
-        on <structname>pg_class</structname> and allow logical decoding of two-phase
-        transactions.
+        on <structname>pg_class</structname> (or any other [user] catalog table) and
+        allow logical decoding of two-phase transactions.
</para>
</listitem>
<listitem>
<para>
<command>PREPARE TRANSACTION</command> after <command>CLUSTER</command>
-        command on <structname>pg_trigger</structname> and allow logical decoding of
-        two-phase transactions. This will lead to deadlock only when published table
-        have a trigger.
+        command on <structname>pg_trigger</structname> (or any other [user] catalog
+        table) and allow logical decoding of two-phase transactions. This will lead
+        to deadlock only when published table have a trigger.

Now we have the four paren supplementary descriptions,
not to make users miss any other [user] catalog tables.
Because of this, the built output html gives me some redundant
impression, for that parts. Accordingly, couldn't we move them
to outside of the itemizedlist section in a simple manner ?

For example, to insert a sentence below the list,
after removing the paren descriptions in the listitem, which says
"Note that those commands that can cause deadlock apply to not only
explicitly indicated system catalog tables above but also any other [user] catalog table."

Sounds reasonable to me. /but also any other/but also to any other/,
to seems to be missing in the above line. Kindly send an update patch.

--
With Regards,
Amit Kapila.

#55osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: Amit Kapila (#54)
1 attachment(s)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Sunday, June 20, 2021 3:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sun, Jun 20, 2021 at 9:28 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

* doc/src/sgml/logicaldecoding.sgml

...

Now we have the four paren supplementary descriptions, not to make
users miss any other [user] catalog tables.
Because of this, the built output html gives me some redundant
impression, for that parts. Accordingly, couldn't we move them to
outside of the itemizedlist section in a simple manner ?

For example, to insert a sentence below the list, after removing the
paren descriptions in the listitem, which says "Note that those
commands that can cause deadlock apply to not only explicitly
indicated system catalog tables above but also any other [user] catalog table."

Sounds reasonable to me. /but also any other/but also to any other/, to
seems to be missing in the above line. Kindly send an update patch.

Excuse me, I don't understand the second sentence.
I wrote "but also" clause in my example.

Also, attached the patch for the change to the HEAD.

Best Regards,
Takamichi Osumi

Attachments:

v3-0001-Doc-Update-caveats-in-synchronous-logical-replica.patchapplication/octet-stream; name=v3-0001-Doc-Update-caveats-in-synchronous-logical-replica.patch
#56osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: osumi.takamichi@fujitsu.com (#55)
1 attachment(s)
RE: locking [user] catalog tables vs 2pc vs logical rep

On Sunday, June 20, 2021 9:50 PM I wrote:

On Sunday, June 20, 2021 3:23 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Sun, Jun 20, 2021 at 9:28 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

* doc/src/sgml/logicaldecoding.sgml

...

Now we have the four paren supplementary descriptions, not to make
users miss any other [user] catalog tables.
Because of this, the built output html gives me some redundant
impression, for that parts. Accordingly, couldn't we move them to
outside of the itemizedlist section in a simple manner ?

For example, to insert a sentence below the list, after removing the
paren descriptions in the listitem, which says "Note that those
commands that can cause deadlock apply to not only explicitly
indicated system catalog tables above but also any other [user] catalog

table."

Sounds reasonable to me. /but also any other/but also to any other/,
to seems to be missing in the above line. Kindly send an update patch.

Excuse me, I don't understand the second sentence.
I wrote "but also" clause in my example.

Also, attached the patch for the change to the HEAD.

I've updated the patch to follow the correction Amit-san mentioned.
Please check.

Best Regards,
Takamichi Osumi

Attachments:

v4-0001-Doc-Update-caveats-in-synchronous-logical-replica.patchapplication/octet-stream; name=v4-0001-Doc-Update-caveats-in-synchronous-logical-replica.patch
#57Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: osumi.takamichi@fujitsu.com (#56)
Re: locking [user] catalog tables vs 2pc vs logical rep

On Mon, Jun 21, 2021 at 8:48 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

On Sunday, June 20, 2021 9:50 PM I wrote:

On Sunday, June 20, 2021 3:23 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Sun, Jun 20, 2021 at 9:28 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:

* doc/src/sgml/logicaldecoding.sgml

...

Now we have the four paren supplementary descriptions, not to make
users miss any other [user] catalog tables.
Because of this, the built output html gives me some redundant
impression, for that parts. Accordingly, couldn't we move them to
outside of the itemizedlist section in a simple manner ?

For example, to insert a sentence below the list, after removing the
paren descriptions in the listitem, which says "Note that those
commands that can cause deadlock apply to not only explicitly
indicated system catalog tables above but also any other [user] catalog

table."

Sounds reasonable to me. /but also any other/but also to any other/,
to seems to be missing in the above line. Kindly send an update patch.

Excuse me, I don't understand the second sentence.
I wrote "but also" clause in my example.

Also, attached the patch for the change to the HEAD.

I've updated the patch to follow the correction Amit-san mentioned.
Please check.

Pushed.

--
With Regards,
Amit Kapila.