Make reorder buffer max_changes_in_memory adjustable?

Started by Jingtang Zhang · over 1 year ago · 4 messages
#1 Jingtang Zhang
mrdrivingduck@gmail.com

Hi hackers.

Recently I ran into an issue with logically replicating very big
transactions. Since we already have logical_decoding_work_mem to cap
memory usage, there is no risk of OOM during decoding itself. However,
memory usage can still go out of control in the 'Tuples' memory context
of the reorder buffer: when spilled transactions are restored from disk,
the only limit is max_changes_in_memory, which is hard-coded to 4096
changes per transaction, just as decoding was limited before v13.
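
To illustrate with rough back-of-the-envelope numbers (not measurements
from a real workload): the 4096-change cap counts changes per
transaction, not bytes, so when many spilled (sub)transactions are
restored at once, the restore path can need far more memory than
logical_decoding_work_mem. A small standalone demo of the arithmetic,
with made-up sizes:

#include <stdio.h>

int
main(void)
{
	/* All values below are illustrative assumptions, not measurements. */
	const long	max_changes_in_memory = 4096;	/* hard-coded cap, per transaction */
	const long	avg_change_bytes = 8 * 1024;	/* assume one fairly wide tuple per change */
	const long	restored_txns = 64;				/* assume 64 spilled (sub)transactions restored together */
	const long	work_mem_mb = 64;				/* logical_decoding_work_mem default, in MB */

	long long	restore_peak = (long long) max_changes_in_memory *
		avg_change_bytes * restored_txns;

	printf("restore peak ~%lld MB vs logical_decoding_work_mem = %ld MB\n",
		   restore_peak / (1024 * 1024), work_mem_mb);
	return 0;
}

With these assumed numbers the restore path would want roughly 2 GB
while logical_decoding_work_mem is only 64 MB.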

For big transactions we have supported streaming mode since v14, which
should solve this issue, but using streaming mode relies on the
subscriber's support. There are still a lot of PostgreSQL instances
running v12/13 in production, or maybe v11 or older even though EOLed.
Also, many CDC tools that logically replicate from PostgreSQL do not
seem to support streaming either.

Would it be possible to make max_changes_in_memory a GUC so it can be
adjusted dynamically? The default could stay at the current 4096. When
a big transaction hits a memory-constrained machine, we could at least
lower max_changes_in_memory so the logical WAL sender can get through
the transaction. Otherwise the WAL sender may get killed (kill -9) and
recovery is needed. After recovery, the WAL sender has to restart from
a point before the transaction started, and it repeats this loop
without making any progress. It would never get past the transaction
except by adding more memory to the machine, which is usually not
practical in reality.
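
To make the proposal concrete, here is a rough, hypothetical sketch of
what such a GUC could look like in guc_tables.c (no such GUC exists
today; the name, group, context, and bounds are just a starting point):

/* hypothetical: replace the hard-coded constant with a GUC-backed variable */
int		max_changes_in_memory = 4096;

/* hypothetical entry for the ConfigureNamesInt[] table in guc_tables.c */
{
	{"max_changes_in_memory", PGC_SIGHUP, RESOURCES_MEM,
		gettext_noop("Maximum number of reorder buffer changes kept in memory "
					 "per transaction while restoring spilled changes."),
		NULL
	},
	&max_changes_in_memory,
	4096, 16, INT_MAX,
	NULL, NULL, NULL
},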

Sincerely, Jingtang

#2 Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Jingtang Zhang (#1)
Re: Make reorder buffer max_changes_in_memory adjustable?

On 7/21/24 07:19, Jingtang Zhang wrote:

> Hi hackers.
>
> Recently I ran into an issue with logically replicating very big
> transactions. Since we already have logical_decoding_work_mem to cap
> memory usage, there is no risk of OOM during decoding itself. However,
> memory usage can still go out of control in the 'Tuples' memory context
> of the reorder buffer: when spilled transactions are restored from disk,
> the only limit is max_changes_in_memory, which is hard-coded to 4096
> changes per transaction, just as decoding was limited before v13.
>
> For big transactions we have supported streaming mode since v14, which
> should solve this issue, but using streaming mode relies on the
> subscriber's support. There are still a lot of PostgreSQL instances
> running v12/13 in production, or maybe v11 or older even though EOLed.
> Also, many CDC tools that logically replicate from PostgreSQL do not
> seem to support streaming either.
>
> Would it be possible to make max_changes_in_memory a GUC so it can be
> adjusted dynamically? The default could stay at the current 4096. When
> a big transaction hits a memory-constrained machine, we could at least
> lower max_changes_in_memory so the logical WAL sender can get through
> the transaction. Otherwise the WAL sender may get killed (kill -9) and
> recovery is needed. After recovery, the WAL sender has to restart from
> a point before the transaction started, and it repeats this loop
> without making any progress. It would never get past the transaction
> except by adding more memory to the machine, which is usually not
> practical in reality.

Theoretically, yes, we could make max_changes_in_memory a GUC, but it's
not clear to me how that would help 12/13, because there's ~0% chance
we'd backpatch that ...

But even for master, is a GUC really the appropriate solution? It's
still a manual action, so if things go wrong some human has to connect,
try setting a lower value, which might or might not work, etc.

Wouldn't it be better to adjust the value automatically, somehow? For
example, before restoring the changes, we could count the number of
transactions and set it to 4096/ntransactions, or something like that.
Or do something smarter by estimating the tuple size, to count it
against the logical_decoding_work_mem budget.
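
Something along these lines, as a standalone sketch of the idea (the
function name, parameters, and fallback values are made up for
illustration, not actual reorderbuffer.c code):

#include <stddef.h>

/*
 * Hypothetical helper: split the logical_decoding_work_mem budget across
 * the transactions whose spilled changes have to be restored, and convert
 * that into a per-transaction change count to use instead of a fixed 4096.
 */
static size_t
restore_budget_changes(size_t work_mem_kb, int ntransactions,
					   size_t avg_change_bytes)
{
	size_t		per_txn_bytes;
	size_t		nchanges;

	if (ntransactions < 1)
		ntransactions = 1;
	if (avg_change_bytes < 1)
		avg_change_bytes = 1;

	/* split the total memory budget across all transactions being restored */
	per_txn_bytes = (work_mem_kb * 1024) / (size_t) ntransactions;

	/* translate bytes into a change count */
	nchanges = per_txn_bytes / avg_change_bytes;

	/* keep a small floor so the restore loop always makes progress */
	return (nchanges < 16) ? 16 : nchanges;
}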

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3 Jingtang Zhang
mrdrivingduck@gmail.com
In reply to: Tomas Vondra (#2)
Re: Make reorder buffer max_changes_in_memory adjustable?

Thanks, Tomas.

> Theoretically, yes, we could make max_changes_in_memory a GUC, but it's
> not clear to me how that would help 12/13, because there's ~0% chance
> we'd backpatch that ...

What I mean is not about back-patching; the change would happen on the
publisher side.

Consider a setup where the publisher is PostgreSQL v14 or later (with
streaming support) and the subscriber is v12/13, where streaming is not
supported: the publisher would still be at risk of OOM. The same thing
happens when we use v14 or later as the publisher and some open-source
CDC tool as the subscriber.

> Wouldn't it be better to adjust the value automatically, somehow? For
> example, before restoring the changes, we could count the number of
> transactions and set it to 4096/ntransactions, or something like that.
> Or do something smarter by estimating the tuple size, to count it
> against the logical_decoding_work_mem budget.

Yes, I think this issue should have been solved when
logical_decoding_work_mem was initially introduced, but it wasn't.
There may have been reasons, such as the sub-transaction handling
mentioned in the comment at the top of reorderbuffer.c.

regards, Jingtang

#4 Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Jingtang Zhang (#3)
Re: Make reorder buffer max_changes_in_memory adjustable?

On 7/22/24 05:28, Jingtang Zhang wrote:

> Thanks, Tomas.
>
>> Theoretically, yes, we could make max_changes_in_memory a GUC, but it's
>> not clear to me how that would help 12/13, because there's ~0% chance
>> we'd backpatch that ...
>
> What I mean is not about back-patching; the change would happen on the
> publisher side.
>
> Consider a setup where the publisher is PostgreSQL v14 or later (with
> streaming support) and the subscriber is v12/13, where streaming is not
> supported: the publisher would still be at risk of OOM. The same thing
> happens when we use v14 or later as the publisher and some open-source
> CDC tool as the subscriber.

Yes, you're right - if we're talking about mixed setups with just the
subscriber running the old version, then it would benefit from this
improvement even without backpatching.

>> Wouldn't it be better to adjust the value automatically, somehow? For
>> example, before restoring the changes, we could count the number of
>> transactions and set it to 4096/ntransactions, or something like that.
>> Or do something smarter by estimating the tuple size, to count it
>> against the logical_decoding_work_mem budget.

> Yes, I think this issue should have been solved when
> logical_decoding_work_mem was initially introduced, but it wasn't.
> There may have been reasons, such as the sub-transaction handling
> mentioned in the comment at the top of reorderbuffer.c.

True, but few patches are perfect/complete from v1. There's often stuff
that's unlikely to happen in practice and gets left for future
improvements, and I believe this is one of those cases. The fact that
the comment even mentions this is a sign the developers considered it,
and chose to ignore it.

That being said, I think it'd be nice to improve this, and I'm willing
to take a look if someone prepares a patch. But I don't think making
this a GUC is the right approach - it's the simplest patch, but every
new GUC just makes the database harder to manage.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company