Forget close an open relation in ReorderBufferProcessTXN()
The RelationIdGetRelation() comment says:
Caller should eventually decrement count. (Usually,
that happens by calling RelationClose().)
However, it doesn't do it in ReorderBufferProcessTXN().
I think we should close it, here is a patch that fixes it. Thoughts?
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 52d06285a2..aac6ffc602 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2261,7 +2261,10 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
elog(ERROR, "could not open relation with OID %u", relid);
if (!RelationIsLogicallyLogged(relation))
+ {
+ RelationClose(relation);
continue;
+ }
relations[nrelations++] = relation;
}
On Thu, Apr 15, 2021 at 4:00 PM Japin Li <japinli@hotmail.com> wrote:
The RelationIdGetRelation() comment says:
Caller should eventually decrement count. (Usually,
that happens by calling RelationClose().)However, it doesn't do it in ReorderBufferProcessTXN().
I think we should close it, here is a patch that fixes it. Thoughts?
+1. Your fix looks correct to me but can we test it in some way?
--
With Regards,
Amit Kapila.
Amit Kapila <amit.kapila16@gmail.com> writes:
On Thu, Apr 15, 2021 at 4:00 PM Japin Li <japinli@hotmail.com> wrote:
The RelationIdGetRelation() comment says:
Caller should eventually decrement count. (Usually,
that happens by calling RelationClose().)However, it doesn't do it in ReorderBufferProcessTXN().
I think we should close it, here is a patch that fixes it. Thoughts?
+1. Your fix looks correct to me but can we test it in some way?
I think this code has a bigger problem: it should not be using
RelationIdGetRelation and RelationClose directly. 99.44% of
the backend goes through relation_open or one of the other
relation.c wrappers, so why doesn't this?
Possibly the answer is "it copied the equally misguided code
in pgoutput.c". A quick grep shows nothing else doing it this
way.
regards, tom lane
On Thu, Apr 15, 2021 at 10:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Amit Kapila <amit.kapila16@gmail.com> writes:
On Thu, Apr 15, 2021 at 4:00 PM Japin Li <japinli@hotmail.com> wrote:
The RelationIdGetRelation() comment says:
Caller should eventually decrement count. (Usually,
that happens by calling RelationClose().)However, it doesn't do it in ReorderBufferProcessTXN().
I think we should close it, here is a patch that fixes it. Thoughts?+1. Your fix looks correct to me but can we test it in some way?
I think this code has a bigger problem: it should not be using
RelationIdGetRelation and RelationClose directly. 99.44% of
the backend goes through relation_open or one of the other
relation.c wrappers, so why doesn't this?
I think it is because relation_open expects either caller to have a
lock on the relation or don't use 'NoLock' lockmode. AFAIU, we don't
need to acquire a lock on relation while decoding changes from WAL
because it uses a historic snapshot to build a relcache entry and all
the later changes to the rel are absorbed while decoding WAL.
I think it is also important to *not* acquire any lock on relation
otherwise it can lead to some sort of deadlock or infinite wait in the
decoding process. Consider a case for operations like Truncate (or if
the user has acquired an exclusive lock on the relation in some other
way say via Lock command) which acquires an exclusive lock on
relation, it won't get replicated in synchronous mode (when
synchronous_standby_name is configured). The truncate operation will
wait for the transaction to be replicated to the subscriber and the
decoding process will wait for the Truncate operation to finish.
Possibly the answer is "it copied the equally misguided code
in pgoutput.c".
I think it is following what is done during decoding, otherwise, it
will lead to the problems as described above. We are already
discussing one of the similar problems [1]/messages/by-id/OS0PR01MB6113C2499C7DC70EE55ADB82FB759@OS0PR01MB6113.jpnprd01.prod.outlook.com where pgoutput
unintentionally acquired a lock on the index and lead to a sort of
deadlock.
If the above understanding is correct, I think we might want to
improve comments in this area.
[1]: /messages/by-id/OS0PR01MB6113C2499C7DC70EE55ADB82FB759@OS0PR01MB6113.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.
Hi,
On 2021-04-16 08:42:40 +0530, Amit Kapila wrote:
I think it is because relation_open expects either caller to have a
lock on the relation or don't use 'NoLock' lockmode. AFAIU, we don't
need to acquire a lock on relation while decoding changes from WAL
because it uses a historic snapshot to build a relcache entry and all
the later changes to the rel are absorbed while decoding WAL.
Right.
I think it is also important to *not* acquire any lock on relation
otherwise it can lead to some sort of deadlock or infinite wait in the
decoding process. Consider a case for operations like Truncate (or if
the user has acquired an exclusive lock on the relation in some other
way say via Lock command) which acquires an exclusive lock on
relation, it won't get replicated in synchronous mode (when
synchronous_standby_name is configured). The truncate operation will
wait for the transaction to be replicated to the subscriber and the
decoding process will wait for the Truncate operation to finish.
However, this cannot be really relied upon for catalog tables. An output
function might acquire locks or such. But for those we do not need to
decode contents...
This made me take a brief look at pgoutput.c - maybe I am missing
something, but how is the following not a memory leak?
static void
maybe_send_schema(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn, ReorderBufferChange *change,
Relation relation, RelationSyncEntry *relentry)
{
...
/* Map must live as long as the session does. */
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
relentry->map = convert_tuples_by_name(CreateTupleDescCopy(indesc),
CreateTupleDescCopy(outdesc));
MemoryContextSwitchTo(oldctx);
send_relation_and_attrs(ancestor, xid, ctx);
RelationClose(ancestor);
If - and that's common - convert_tuples_by_name() won't have to do
anything, the copied tuple descs will be permanently leaked.
Greetings,
Andres Freund
On Fri, Apr 16, 2021 at 11:24 PM Andres Freund <andres@anarazel.de> wrote:
I think it is also important to *not* acquire any lock on relation
otherwise it can lead to some sort of deadlock or infinite wait in the
decoding process. Consider a case for operations like Truncate (or if
the user has acquired an exclusive lock on the relation in some other
way say via Lock command) which acquires an exclusive lock on
relation, it won't get replicated in synchronous mode (when
synchronous_standby_name is configured). The truncate operation will
wait for the transaction to be replicated to the subscriber and the
decoding process will wait for the Truncate operation to finish.However, this cannot be really relied upon for catalog tables. An output
function might acquire locks or such. But for those we do not need to
decode contents...
True, so, if we don't need to decode contents then we won't have the
problems of the above kind.
This made me take a brief look at pgoutput.c - maybe I am missing
something, but how is the following not a memory leak?static void
maybe_send_schema(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn, ReorderBufferChange *change,
Relation relation, RelationSyncEntry *relentry)
{
...
/* Map must live as long as the session does. */
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
relentry->map = convert_tuples_by_name(CreateTupleDescCopy(indesc),
CreateTupleDescCopy(outdesc));
MemoryContextSwitchTo(oldctx);
send_relation_and_attrs(ancestor, xid, ctx);
RelationClose(ancestor);If - and that's common - convert_tuples_by_name() won't have to do
anything, the copied tuple descs will be permanently leaked.
I also think this is a permanent leak. I think we need to free all the
memory associated with this map on the invalidation of this particular
relsync entry (basically in rel_sync_cache_relation_cb).
--
With Regards,
Amit Kapila.
On Thu, Apr 15, 2021 at 4:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Apr 15, 2021 at 4:00 PM Japin Li <japinli@hotmail.com> wrote:
The RelationIdGetRelation() comment says:
Caller should eventually decrement count. (Usually,
that happens by calling RelationClose().)However, it doesn't do it in ReorderBufferProcessTXN().
I think we should close it, here is a patch that fixes it. Thoughts?+1. Your fix looks correct to me but can we test it in some way?
I have tried to find a test but not able to find one. I have tried
with a foreign table but we don't log truncate for it, see
ExecuteTruncate. It has a check that it will log for relids where
RelationIsLogicallyLogged. If that is the case, it is not clear to me
how we can ever hit this condition? Have you tried to find the test?
--
With Regards,
Amit Kapila.
On Fri, Apr 16, 2021 at 11:24 PM Andres Freund <andres@anarazel.de> wrote:
I think it is also important to *not* acquire any lock on relation
otherwise it can lead to some sort of deadlock or infinite wait in the
decoding process. Consider a case for operations like Truncate (or if
the user has acquired an exclusive lock on the relation in some other
way say via Lock command) which acquires an exclusive lock on
relation, it won't get replicated in synchronous mode (when
synchronous_standby_name is configured). The truncate operation will
wait for the transaction to be replicated to the subscriber and the
decoding process will wait for the Truncate operation to finish.However, this cannot be really relied upon for catalog tables. An output
function might acquire locks or such. But for those we do not need to
decode contents...
I see that if we define a user_catalog_table (create table t1_cat(c1
int) WITH(user_catalog_table = true);), we are able to decode
operations like (insert, truncate) on such a table. What do you mean
by "But for those we do not need to decode contents"?
--
With Regards,
Amit Kapila.
On Sat, 17 Apr 2021 at 14:09, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Apr 15, 2021 at 4:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Apr 15, 2021 at 4:00 PM Japin Li <japinli@hotmail.com> wrote:
The RelationIdGetRelation() comment says:
Caller should eventually decrement count. (Usually,
that happens by calling RelationClose().)However, it doesn't do it in ReorderBufferProcessTXN().
I think we should close it, here is a patch that fixes it. Thoughts?+1. Your fix looks correct to me but can we test it in some way?
I have tried to find a test but not able to find one. I have tried
with a foreign table but we don't log truncate for it, see
ExecuteTruncate. It has a check that it will log for relids where
RelationIsLogicallyLogged. If that is the case, it is not clear to me
how we can ever hit this condition? Have you tried to find the test?
I also don't find a test for this. It is introduced in 5dfd1e5a6696,
wrote by Simon Riggs, Marco Nenciarini and Peter Eisentraut. Maybe they
can explain when we can enter this condition?
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Sat, Apr 17, 2021 at 12:05 PM Japin Li <japinli@hotmail.com> wrote:
On Sat, 17 Apr 2021 at 14:09, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Apr 15, 2021 at 4:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Apr 15, 2021 at 4:00 PM Japin Li <japinli@hotmail.com> wrote:
The RelationIdGetRelation() comment says:
Caller should eventually decrement count. (Usually,
that happens by calling RelationClose().)However, it doesn't do it in ReorderBufferProcessTXN().
I think we should close it, here is a patch that fixes it. Thoughts?+1. Your fix looks correct to me but can we test it in some way?
I have tried to find a test but not able to find one. I have tried
with a foreign table but we don't log truncate for it, see
ExecuteTruncate. It has a check that it will log for relids where
RelationIsLogicallyLogged. If that is the case, it is not clear to me
how we can ever hit this condition? Have you tried to find the test?I also don't find a test for this. It is introduced in 5dfd1e5a6696,
wrote by Simon Riggs, Marco Nenciarini and Peter Eisentraut. Maybe they
can explain when we can enter this condition?
My guess is that this has been copied from the code a few lines above
to handle insert/update/delete where it is required to handle some DDL
ops like Alter Table but I think we don't need it here (for Truncate
op). If that understanding turns out to be true then we should either
have an Assert for this or an elog message.
--
With Regards,
Amit Kapila.
On Sat, Apr 17, 2021 at 12:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Apr 16, 2021 at 11:24 PM Andres Freund <andres@anarazel.de> wrote:
I think it is also important to *not* acquire any lock on relation
otherwise it can lead to some sort of deadlock or infinite wait in the
decoding process. Consider a case for operations like Truncate (or if
the user has acquired an exclusive lock on the relation in some other
way say via Lock command) which acquires an exclusive lock on
relation, it won't get replicated in synchronous mode (when
synchronous_standby_name is configured). The truncate operation will
wait for the transaction to be replicated to the subscriber and the
decoding process will wait for the Truncate operation to finish.However, this cannot be really relied upon for catalog tables. An output
function might acquire locks or such. But for those we do not need to
decode contents...I see that if we define a user_catalog_table (create table t1_cat(c1
int) WITH(user_catalog_table = true);), we are able to decode
operations like (insert, truncate) on such a table. What do you mean
by "But for those we do not need to decode contents"?
I think we are allowed to decode the operations on user catalog tables
because we are using RelationIsLogicallyLogged() instead of
RelationIsAccessibleInLogicalDecoding() in ReorderBufferProcessTXN().
Based on this discussion, I think we should not be allowing decoding
of operations on user catalog tables, so we should use
RelationIsAccessibleInLogicalDecoding to skip such ops in
ReorderBufferProcessTXN(). Am, I missing something?
Can you please clarify?
--
With Regards,
Amit Kapila.
On Sat, Apr 17, 2021 at 1:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Apr 16, 2021 at 11:24 PM Andres Freund <andres@anarazel.de> wrote:> > This made me take a brief look at pgoutput.c - maybe I am missing
something, but how is the following not a memory leak?
static void
maybe_send_schema(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn, ReorderBufferChange *change,
Relation relation, RelationSyncEntry *relentry)
{
...
/* Map must live as long as the session does. */
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
relentry->map = convert_tuples_by_name(CreateTupleDescCopy(indesc),
CreateTupleDescCopy(outdesc));
MemoryContextSwitchTo(oldctx);
send_relation_and_attrs(ancestor, xid, ctx);
RelationClose(ancestor);If - and that's common - convert_tuples_by_name() won't have to do
anything, the copied tuple descs will be permanently leaked.I also think this is a permanent leak. I think we need to free all the
memory associated with this map on the invalidation of this particular
relsync entry (basically in rel_sync_cache_relation_cb).
I agree there's a problem here.
Back in:
/messages/by-id/CA+HiwqEeU19iQgjN6HF1HTPU0L5+JxyS5CmxaOVGNXBAfUY06Q@mail.gmail.com
I had proposed to move the map creation from maybe_send_schema() to
get_rel_sync_entry(), mainly because the latter is where I realized it
belongs, though a bit too late. Attached is the part of the patch
for this particular issue. It also takes care to release the copied
TupleDescs if the map is found to be unnecessary, thus preventing
leaking into CacheMemoryContext.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
pgoutput-map-tupdesc-leak.patchapplication/octet-stream; name=pgoutput-map-tupdesc-leak.patchDownload+34-9
On Saturday, April 17, 2021 4:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Apr 17, 2021 at 12:05 PM Japin Li <japinli@hotmail.com> wrote:
On Sat, 17 Apr 2021 at 14:09, Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Thu, Apr 15, 2021 at 4:53 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Thu, Apr 15, 2021 at 4:00 PM Japin Li <japinli@hotmail.com> wrote:
The RelationIdGetRelation() comment says:
Caller should eventually decrement count. (Usually, that
happens by calling RelationClose().)However, it doesn't do it in ReorderBufferProcessTXN().
I think we should close it, here is a patch that fixes it. Thoughts?+1. Your fix looks correct to me but can we test it in some way?
I have tried to find a test but not able to find one. I have tried
with a foreign table but we don't log truncate for it, see
ExecuteTruncate. It has a check that it will log for relids where
RelationIsLogicallyLogged. If that is the case, it is not clear to
me how we can ever hit this condition? Have you tried to find the test?I also don't find a test for this. It is introduced in 5dfd1e5a6696,
wrote by Simon Riggs, Marco Nenciarini and Peter Eisentraut. Maybe
they can explain when we can enter this condition?My guess is that this has been copied from the code a few lines above to
handle insert/update/delete where it is required to handle some DDL ops like
Alter Table but I think we don't need it here (for Truncate op). If that
understanding turns out to be true then we should either have an Assert for
this or an elog message.
In this thread, we are discussing 3 topics below...
(1) necessity of the check for REORDER_BUFFER_CHANGE_TRUNCATE in ReorderBufferProcessTXN()
(2) discussion of whether we disallow decoding of operations on user catalog tables or not
(3) memory leak of maybe_send_schema() (patch already provided)
Let's address those one by one.
In terms of (1), which was close to the motivation of this thread,
first of all, I traced the truncate processing
and I think the check is done by truncate command side as well.
I preferred Assert rather than never called elog,
but it's OK to choose elog if someone has strong opinion on it.
Attached the patch for this.
Best Regards,
Takamichi Osumi
Attachments:
replace_check_of_ReorderBufferProcessTXN_v01.patchapplication/octet-stream; name=replace_check_of_ReorderBufferProcessTXN_v01.patchDownload+6-2
On Fri, 23 Apr 2021 at 22:33, osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote:
On Saturday, April 17, 2021 4:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Apr 17, 2021 at 12:05 PM Japin Li <japinli@hotmail.com> wrote:
On Sat, 17 Apr 2021 at 14:09, Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Thu, Apr 15, 2021 at 4:53 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Thu, Apr 15, 2021 at 4:00 PM Japin Li <japinli@hotmail.com> wrote:
The RelationIdGetRelation() comment says:
Caller should eventually decrement count. (Usually, that
happens by calling RelationClose().)However, it doesn't do it in ReorderBufferProcessTXN().
I think we should close it, here is a patch that fixes it. Thoughts?+1. Your fix looks correct to me but can we test it in some way?
I have tried to find a test but not able to find one. I have tried
with a foreign table but we don't log truncate for it, see
ExecuteTruncate. It has a check that it will log for relids where
RelationIsLogicallyLogged. If that is the case, it is not clear to
me how we can ever hit this condition? Have you tried to find the test?I also don't find a test for this. It is introduced in 5dfd1e5a6696,
wrote by Simon Riggs, Marco Nenciarini and Peter Eisentraut. Maybe
they can explain when we can enter this condition?My guess is that this has been copied from the code a few lines above to
handle insert/update/delete where it is required to handle some DDL ops like
Alter Table but I think we don't need it here (for Truncate op). If that
understanding turns out to be true then we should either have an Assert for
this or an elog message.In this thread, we are discussing 3 topics below...
(1) necessity of the check for REORDER_BUFFER_CHANGE_TRUNCATE in ReorderBufferProcessTXN()
(2) discussion of whether we disallow decoding of operations on user catalog tables or not
(3) memory leak of maybe_send_schema() (patch already provided)Let's address those one by one.
In terms of (1), which was close to the motivation of this thread,
first of all, I traced the truncate processing
and I think the check is done by truncate command side as well.
I preferred Assert rather than never called elog,
but it's OK to choose elog if someone has strong opinion on it.
Attached the patch for this.
+1, make check-world passed.
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
On Fri, Apr 23, 2021 at 8:03 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Saturday, April 17, 2021 4:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I also don't find a test for this. It is introduced in 5dfd1e5a6696,
wrote by Simon Riggs, Marco Nenciarini and Peter Eisentraut. Maybe
they can explain when we can enter this condition?My guess is that this has been copied from the code a few lines above to
handle insert/update/delete where it is required to handle some DDL ops like
Alter Table but I think we don't need it here (for Truncate op). If that
understanding turns out to be true then we should either have an Assert for
this or an elog message.In this thread, we are discussing 3 topics below...
(1) necessity of the check for REORDER_BUFFER_CHANGE_TRUNCATE in ReorderBufferProcessTXN()
(2) discussion of whether we disallow decoding of operations on user catalog tables or not
(3) memory leak of maybe_send_schema() (patch already provided)Let's address those one by one.
In terms of (1), which was close to the motivation of this thread,
I think (1) and (2) are related because if we need (2) then the check
removed by (1) needs to be replaced with another check. So, I am not
sure how to make this decision.
--
With Regards,
Amit Kapila.
On Tuesday, April 20, 2021 12:07 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Sat, Apr 17, 2021 at 1:30 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:On Fri, Apr 16, 2021 at 11:24 PM Andres Freund <andres@anarazel.de>
wrote:> > This made me take a brief look at pgoutput.c - maybe I am
missingsomething, but how is the following not a memory leak?
static void
maybe_send_schema(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn, ReorderBufferChange*change,
Relation relation, RelationSyncEntry *relentry) {
...
/* Map must live as long as the session does. */
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
relentry->map =convert_tuples_by_name(CreateTupleDescCopy(indesc),
CreateTupleDescCopy(outdesc));
MemoryContextSwitchTo(oldctx);
send_relation_and_attrs(ancestor, xid, ctx);
RelationClose(ancestor);If - and that's common - convert_tuples_by_name() won't have to do
anything, the copied tuple descs will be permanently leaked.I also think this is a permanent leak. I think we need to free all the
memory associated with this map on the invalidation of this particular
relsync entry (basically in rel_sync_cache_relation_cb).I agree there's a problem here.
Back in:
/messages/by-id/CA+HiwqEeU19iQgjN6HF1HTP
U0L5%2BJxyS5CmxaOVGNXBAfUY06Q%40mail.gmail.comI had proposed to move the map creation from maybe_send_schema() to
get_rel_sync_entry(), mainly because the latter is where I realized it
belongs, though a bit too late. Attached is the part of the patch
for this particular issue. It also takes care to release the copied TupleDescs
if the map is found to be unnecessary, thus preventing leaking into
CacheMemoryContext.
Thank you for sharing the patch.
Your patch looks correct to me. Make check-world has
passed with it. Also, I agree with the idea to place
the processing to set the map in the get_rel_sync_entry.
One thing I'd like to ask is an advanced way to confirm
the memory leak is solved by the patch, not just by running make check-world.
I used OSS HEAD and valgrind, expecting that
I could see function stack which has a call of CreateTupleDescCopy
from both pgoutput_change and pgoutput_truncate as memory leak report
in the valgrind logs, and they disappear after applying the patch.
But, I cannot find the pair of pgoutput functions and CreateTupleDescCopy in one report
when I used OSS HEAD, which means that I need to do advanced testing to check if
the memory leak of CreateTupleDescCopy is addressed.
I collected the logs from RT at src/test/subscription so should pass the routes of our interest.
Could someone give me
an advise about the way to confirm the memory leak is solved ?
Best Regards,
Takamichi Osumi
On Monday, April 26, 2021 2:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Apr 23, 2021 at 8:03 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:On Saturday, April 17, 2021 4:13 PM Amit Kapila
<amit.kapila16@gmail.com> wrote:
I also don't find a test for this. It is introduced in
5dfd1e5a6696, wrote by Simon Riggs, Marco Nenciarini and Peter
Eisentraut. Maybe they can explain when we can enter this condition?My guess is that this has been copied from the code a few lines
above to handle insert/update/delete where it is required to handle
some DDL ops like Alter Table but I think we don't need it here (for
Truncate op). If that understanding turns out to be true then we
should either have an Assert for this or an elog message.In this thread, we are discussing 3 topics below...
(1) necessity of the check for REORDER_BUFFER_CHANGE_TRUNCATE
in
ReorderBufferProcessTXN()
(2) discussion of whether we disallow decoding of operations on user
catalog tables or not
(3) memory leak of maybe_send_schema() (patch already provided)Let's address those one by one.
In terms of (1), which was close to the motivation of this thread,I think (1) and (2) are related because if we need (2) then the check removed
by (1) needs to be replaced with another check. So, I am not sure how to
make this decision.
Yeah, you are right.
On Monday, April 19, 2021 9:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Apr 17, 2021 at 12:01 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:On Fri, Apr 16, 2021 at 11:24 PM Andres Freund <andres@anarazel.de>
wrote:
I think it is also important to *not* acquire any lock on relation
otherwise it can lead to some sort of deadlock or infinite wait in
the decoding process. Consider a case for operations like Truncate
(or if the user has acquired an exclusive lock on the relation in
some other way say via Lock command) which acquires an exclusive
lock on relation, it won't get replicated in synchronous mode
(when synchronous_standby_name is configured). The truncate
operation will wait for the transaction to be replicated to the
subscriber and the decoding process will wait for the Truncateoperation to finish.
However, this cannot be really relied upon for catalog tables. An
output function might acquire locks or such. But for those we do not
need to decode contents...I see that if we define a user_catalog_table (create table t1_cat(c1
int) WITH(user_catalog_table = true);), we are able to decode
operations like (insert, truncate) on such a table. What do you mean
by "But for those we do not need to decode contents"?I think we are allowed to decode the operations on user catalog tables
because we are using RelationIsLogicallyLogged() instead of
RelationIsAccessibleInLogicalDecoding() in ReorderBufferProcessTXN().
Based on this discussion, I think we should not be allowing decoding of
operations on user catalog tables, so we should use
RelationIsAccessibleInLogicalDecoding to skip such ops in
ReorderBufferProcessTXN(). Am, I missing something?Can you please clarify?
I don't understand that point, either.
I read the context where the user_catalog_table was introduced - [1]/messages/by-id/20130914204913.GA4071@awork2.anarazel.de.
There, I couldn't find any discussion if we should skip decode operations
on that kind of tables or not. Accordingly, we just did not conclude it, I suppose.
What surprised me a bit is to decode operations of system catalog table are considered like [2]/messages/by-id/CA+TgmobhDCHuckL_86wRDWJ31Gw3Y1HrQ4yUKEn7U1_hTbeVqQ@mail.gmail.com
somehow at the time. I cannot find any concrete description of such use cases in the thread, though.
Anyway, I felt disallowing decoding of operations on user catalog tables
doesn't spoil the feature's purpose. So, I'm OK to do so. What do you think ?
[1]: /messages/by-id/20130914204913.GA4071@awork2.anarazel.de
Note that in this discussion, user_catalog_table was renamed from
treat_as_catalog_table in the middle of the thread. Searching it might help you to shorten your time to have a look at it.
[2]: /messages/by-id/CA+TgmobhDCHuckL_86wRDWJ31Gw3Y1HrQ4yUKEn7U1_hTbeVqQ@mail.gmail.com
Best Regards,
Takamichi Osumi
On Wed, Apr 28, 2021 at 5:36 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Monday, April 26, 2021 2:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Apr 23, 2021 at 8:03 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
I think we are allowed to decode the operations on user catalog tables
because we are using RelationIsLogicallyLogged() instead of
RelationIsAccessibleInLogicalDecoding() in ReorderBufferProcessTXN().
Based on this discussion, I think we should not be allowing decoding of
operations on user catalog tables, so we should use
RelationIsAccessibleInLogicalDecoding to skip such ops in
ReorderBufferProcessTXN(). Am, I missing something?Can you please clarify?
I don't understand that point, either.
I read the context where the user_catalog_table was introduced - [1].
There, I couldn't find any discussion if we should skip decode operations
on that kind of tables or not. Accordingly, we just did not conclude it, I suppose.What surprised me a bit is to decode operations of system catalog table are considered like [2]
somehow at the time. I cannot find any concrete description of such use cases in the thread, though.Anyway, I felt disallowing decoding of operations on user catalog tables
doesn't spoil the feature's purpose. So, I'm OK to do so. What do you think ?
I am not so sure about it because I think we don't have any example of
user_catalog_tables in the core code. This is the reason I was kind of
looking towards Andres to clarify this. Right now, if the user
performs TRUNCATE on user_catalog_table in synchronous mode then it
will hang in case the decoding plugin takes even share lock on it. The
main reason is that we allow decoding of TRUNCATE operation for
user_catalog_tables. I think even if we want to allow decoding of
other operations on user_catalog_table, the decoding of TRUNCATE
should be prohibited but maybe we shouldn't allow decoding of any
operation on such tables as we don't do it for system catalog tables.
--
With Regards,
Amit Kapila.
Takamichi-san,
On Tue, Apr 27, 2021 at 9:37 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
On Tuesday, April 20, 2021 12:07 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Sat, Apr 17, 2021 at 1:30 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:On Fri, Apr 16, 2021 at 11:24 PM Andres Freund <andres@anarazel.de>
wrote:> > This made me take a brief look at pgoutput.c - maybe I am
missingsomething, but how is the following not a memory leak?
static void
maybe_send_schema(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn, ReorderBufferChange*change,
Relation relation, RelationSyncEntry *relentry) {
...
/* Map must live as long as the session does. */
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
relentry->map =convert_tuples_by_name(CreateTupleDescCopy(indesc),
CreateTupleDescCopy(outdesc));
MemoryContextSwitchTo(oldctx);
send_relation_and_attrs(ancestor, xid, ctx);
RelationClose(ancestor);If - and that's common - convert_tuples_by_name() won't have to do
anything, the copied tuple descs will be permanently leaked.I also think this is a permanent leak. I think we need to free all the
memory associated with this map on the invalidation of this particular
relsync entry (basically in rel_sync_cache_relation_cb).I agree there's a problem here.
Back in:
/messages/by-id/CA+HiwqEeU19iQgjN6HF1HTP
U0L5%2BJxyS5CmxaOVGNXBAfUY06Q%40mail.gmail.comI had proposed to move the map creation from maybe_send_schema() to
get_rel_sync_entry(), mainly because the latter is where I realized it
belongs, though a bit too late. Attached is the part of the patch
for this particular issue. It also takes care to release the copied TupleDescs
if the map is found to be unnecessary, thus preventing leaking into
CacheMemoryContext.Thank you for sharing the patch.
Your patch looks correct to me. Make check-world has
passed with it. Also, I agree with the idea to place
the processing to set the map in the get_rel_sync_entry.
Thanks for checking.
One thing I'd like to ask is an advanced way to confirm
the memory leak is solved by the patch, not just by running make check-world.I used OSS HEAD and valgrind, expecting that
I could see function stack which has a call of CreateTupleDescCopy
from both pgoutput_change and pgoutput_truncate as memory leak report
in the valgrind logs, and they disappear after applying the patch.But, I cannot find the pair of pgoutput functions and CreateTupleDescCopy in one report
when I used OSS HEAD, which means that I need to do advanced testing to check if
the memory leak of CreateTupleDescCopy is addressed.
I collected the logs from RT at src/test/subscription so should pass the routes of our interest.Could someone give me
an advise about the way to confirm the memory leak is solved ?
I have not used valgrind or other testing methods to check this.
To me, it's clear in this case by only looking at the code that the
TupleDescs returned by CreateTupleDescCopy() ought to be freed when
convert_tuples_by_name() determines that no map is necessary such that
there will be no need to keep those TupleDesc copies around. Failing
that, those copies end up in CacheMemoryContext without anything
pointing to them, hence the leak. Actually, since maybe_send_schema()
doesn't execute this code if schema_sent is found to have been set by
earlier calls, the leak in question should occur only once in most
tests.
--
Amit Langote
EDB: http://www.enterprisedb.com
On Thursday, April 29, 2021 2:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I am not so sure about it because I think we don't have any example of
user_catalog_tables in the core code. This is the reason I was kind of looking
towards Andres to clarify this. Right now, if the user performs TRUNCATE on
user_catalog_table in synchronous mode then it will hang in case the
decoding plugin takes even share lock on it. The main reason is that we allow
decoding of TRUNCATE operation for user_catalog_tables. I think even if we
want to allow decoding of other operations on user_catalog_table, the
decoding of TRUNCATE should be prohibited but maybe we shouldn't allow
decoding of any operation on such tables as we don't do it for system catalog
tables.
I tried the following scenarios for trying to reproduce this.
Scenario1:
(1) set up 1 publisher and 1 subscriber
(2) create table with user_catalog_table = true on the pub
(3) insert some data to this table
(4) create publication for the table on the pub
(5) create table with user_catalog_table = true on the sub
(6) create subscription on the sub
(7) add synchronous_standby_names to publisher's configuration and restart the pub
(8) have 1 session to hold a lock to the user_catalog_table on the pub in access share mode
(9) have another session to truncate the user_catalog_table on the pub
Here, It keeps waiting but I'm not sure this is the scenario described above,
since this deadlock is caused by (8)'s lock.
Scenario2:
(1) set up 1 publisher and 1 subscriber
(2) create table with user_catalog_table = true on the pub
(3) insert some data to this table
(4) create publication for the table on the pub
(5) create table with user_catalog_table = true on the sub
(6) create subscription on the sub
(7) add synchronous_standby_names to publisher's configuration and restart the pub
(8) have a session to truncate the user_catalog_table on the pub
Scenario 2 was successful.
Are these the scenario you have in mind,
if not please let me know for the missing steps.
I would like to reproduce the scenario and write a patch to fix this.
Best Regards,
Takamichi Osumi