Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

Started by Michael Paquier9 months ago10 messagesbugs

michael@paquier.xyz

9 months ago

Hi all,

Ethan Mertz (colleague, in CC) has found a bug in reorderbuffer.c when
processing a REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM change, based
on the data gathered from a customer issue. The bug is a
use-after-free, causing random crashes, that can be summarized like
that:
1) "specinsert" is saved as a new change to process:
change = specinsert;
change->action = REORDER_BUFFER_CHANGE_INSERT;
2) A bit later on the change and specinsert are freed when we are done
with the speculative insert:
change_done:

/*
* If speculative insertion was confirmed, the record
* isn't needed anymore.
*/
if (specinsert != NULL)
{
ReorderBufferFreeChange(rb, specinsert, true);
specinsert = NULL;
}
3) Finally, a couple of lines down, we then do the following things,
and attempt to use a reference to change->lsn that has been freed:
#define CHANGES_THRESHOLD 100

if (++changes_count >= CHANGES_THRESHOLD)
{
rb->update_progress_txn(rb, txn, change->lsn);
changes_count = 0;
}

This issue has been introduced in 8c58624df462, down to REL_16_STABLE.
Committer of the related change is added in CC (aka Amit K.).

Sawada-san (also in CC) has come up with an imaginative trick to
easily reproduce the issue, as of:
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2595,7 +2595,7 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
              * overhead, but testing showed there is no noticeable overhead if
              * we do it after every ~100 changes.
              */
-#define CHANGES_THRESHOLD 100
+#define CHANGES_THRESHOLD 1

if (++changes_count >= CHANGES_THRESHOLD)
{

When running the test "ddl" from test_decoding, an instance running
valgrind comes up with that, pointing immediately at the problem:
==28821== Invalid read of size 8
==28821== at 0x222B2D8: ReorderBufferProcessTXN (reorderbuffer.c:2602)
==28821== by 0x222C6D6: ReorderBufferReplay (reorderbuffer.c:2864)
==28821== by 0x222C754: ReorderBufferCommit (reorderbuffer.c:2888)
==28821== by 0x21EDA48: DecodeCommit (decode.c:743)
==28821== by 0x21EA54C: xact_decode (decode.c:242)
==28821== by 0x21E98AA: LogicalDecodingProcessRecord (decode.c:116)
==28821== by 0x22093A4: pg_logical_slot_get_changes_guts (logicalfuncs.c:266)
==28821== by 0x22097BC: pg_logical_slot_get_changes (logicalfuncs.c:333)
==28821== by 0x1AF25D3: ExecMakeTableFunctionResult (execSRF.c:234)
==28821== by 0x1B65606: FunctionNext (nodeFunctionscan.c:94)
==28821== by 0x1AF7306: ExecScanFetch (execScan.h:126)
==28821== by 0x1AF748D: ExecScanExtended (execScan.h:187)
==28821== Address 0x107fae08 is 7,656 bytes inside a recently re-allocated block of size 8,192 alloc'd
==28821== at 0x4844818: malloc (vg_replace_malloc.c:446)
==28821== by 0x2AD3573: SlabAllocFromNewBlock (slab.c:565)
==28821== by 0x2AD3EF7: SlabAlloc (slab.c:656)
==28821== by 0x2AC5282: MemoryContextAlloc (mcxt.c:1237)
==28821== by 0x221E17A: ReorderBufferAllocChange (reorderbuffer.c:511)
==28821== by 0x222F64B: ReorderBufferAddNewTupleCids (reorderbuffer.c:3444)
==28821== by 0x2248439: SnapBuildProcessNewCid (snapbuild.c:700)
==28821== by 0x21EB9D8: heap2_decode (decode.c:437)
==28821== by 0x21E98AA: LogicalDecodingProcessRecord (decode.c:116)
==28821== by 0x22093A4: pg_logical_slot_get_changes_guts (logicalfuncs.c:266)
==28821== by 0x22097BC: pg_logical_slot_get_changes (logicalfuncs.c:333)
==28821== by 0x1AF25D3: ExecMakeTableFunctionResult (execSRF.c:234)

A simple solution suggested by Ethan would be to use the "prev_lsn"
guessed from the change at the beginning of the loop, rather than the
problematic change->lsn. But that does not seem completely right to
me because we can switch to "specinsert" as the change to process,
hence wouldn't we want to call update_progress_txn() based on the LSN
of the actual change we are looking at? All that leads me to the
attached. Note that I am not the most familiar with this area of the
code, so please take my arguments with a grain of salt.

Comments and thoughts are welcome.
--
Michael

Masahiko Sawada

sawada.mshk@gmail.com

9 months ago

In reply to: Michael Paquier (#1)

Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

On Wed, Jul 30, 2025 at 11:44 PM Michael Paquier <michael@paquier.xyz> wrote:

A simple solution suggested by Ethan would be to use the "prev_lsn"
guessed from the change at the beginning of the loop, rather than the
problematic change->lsn. But that does not seem completely right to
me because we can switch to "specinsert" as the change to process,
hence wouldn't we want to call update_progress_txn() based on the LSN
of the actual change we are looking at? All that leads me to the
attached.

Thank you for preparing the patch!

Yes, I think it's sensible to keep the current behavior. So the patch
looks good to me.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Michael Paquier

michael@paquier.xyz

9 months ago

In reply to: Masahiko Sawada (#2)

Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

On Thu, Jul 31, 2025 at 09:51:21AM -0700, Masahiko Sawada wrote:

Yes, I think it's sensible to keep the current behavior. So the patch
looks good to me.

Thanks for the review. I am planning to apply that at the beginning
of next week, in time for the next set of minor releases.
--
Michael

Amit Kapila

amit.kapila16@gmail.com

9 months ago

In reply to: Michael Paquier (#1)

Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

On Thu, Jul 31, 2025 at 12:14 PM Michael Paquier <michael@paquier.xyz> wrote:

Ethan Mertz (colleague, in CC) has found a bug in reorderbuffer.c when
processing a REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM change, based
on the data gathered from a customer issue. The bug is a
use-after-free, causing random crashes, that can be summarized like
that:

...

A simple solution suggested by Ethan would be to use the "prev_lsn"
guessed from the change at the beginning of the loop, rather than the
problematic change->lsn. But that does not seem completely right to
me because we can switch to "specinsert" as the change to process,
hence wouldn't we want to call update_progress_txn() based on the LSN
of the actual change we are looking at?

We still won't be able to capture the latest LSN in case of
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT. IIRC, update_progress_txn
is used to keep the client active so that when many changes are
skipped, the client doesn't timeout. In this case, it seems okay to
use prev_lsn as well.

--
With Regards,
Amit Kapila.

Michael Paquier

michael@paquier.xyz

9 months ago

In reply to: Amit Kapila (#4)

Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

On Fri, Aug 01, 2025 at 10:03:14AM +0530, Amit Kapila wrote:

We still won't be able to capture the latest LSN in case of
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT. IIRC, update_progress_txn
is used to keep the client active so that when many changes are
skipped, the client doesn't timeout. In this case, it seems okay to
use prev_lsn as well.

I am not quite sure to follow your argument here. In the case of a
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT change, we would use
change->lsn, which is in the case of the patch and HEAD the same
thing: prev_lsn. So the logic is unchanged in the case, isn't it?
--
Michael

Amit Kapila

amit.kapila16@gmail.com

9 months ago

In reply to: Michael Paquier (#5)

Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

On Fri, Aug 1, 2025 at 10:22 AM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Aug 01, 2025 at 10:03:14AM +0530, Amit Kapila wrote:

We still won't be able to capture the latest LSN in case of
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT. IIRC, update_progress_txn
is used to keep the client active so that when many changes are
skipped, the client doesn't timeout. In this case, it seems okay to
use prev_lsn as well.

I am not quite sure to follow your argument here. In the case of a
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT change, we would use
change->lsn, which is in the case of the patch and HEAD the same
thing: prev_lsn.

I mean to say we can use the same change LSN both for
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM and
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT. Right now, for
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM, we switch the change to
specinsert which would have a prior LSN value (say, if confirm/abort
record will have value, 1000, it will be 800 or so) but we should
still use 1000 for update_progress_txn. The update_progress_txn() is
helpful when such an insert is skipped by a plugin (in this case
pgouput) and in that case, we would require the latest LSN processed
by reorder buffer to pass to it. We use it to send a keep_alive to a
client with the last LSN processed.

--
With Regards,
Amit Kapila.

Michael Paquier

michael@paquier.xyz

9 months ago

In reply to: Amit Kapila (#6)

Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

On Fri, Aug 01, 2025 at 03:30:17PM +0530, Amit Kapila wrote:

I mean to say we can use the same change LSN both for
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM and
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT. Right now, for
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM, we switch the change to
specinsert which would have a prior LSN value (say, if confirm/abort
record will have value, 1000, it will be 800 or so) but we should
still use 1000 for update_progress_txn. The update_progress_txn() is
helpful when such an insert is skipped by a plugin (in this case
pgouput) and in that case, we would require the latest LSN processed
by reorder buffer to pass to it. We use it to send a keep_alive to a
client with the last LSN processed.

Ah, OK, I've missed your point then. It's kind of an optimization in
itself because we would be a bit more aggressive with the updates, but
I agree to do that in the scope of this fix. The updated attached
uses prev_lsn for the job, for both the ABORT and CONFIRM cases,
meaning a one-liner.
--
Michael

Masahiko Sawada

sawada.mshk@gmail.com

9 months ago

In reply to: Michael Paquier (#7)

Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

On Fri, Aug 1, 2025 at 4:45 AM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Aug 01, 2025 at 03:30:17PM +0530, Amit Kapila wrote:

I mean to say we can use the same change LSN both for
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM and
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT. Right now, for
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM, we switch the change to
specinsert which would have a prior LSN value (say, if confirm/abort
record will have value, 1000, it will be 800 or so) but we should
still use 1000 for update_progress_txn. The update_progress_txn() is
helpful when such an insert is skipped by a plugin (in this case
pgouput) and in that case, we would require the latest LSN processed
by reorder buffer to pass to it. We use it to send a keep_alive to a
client with the last LSN processed.

Ah, OK, I've missed your point then. It's kind of an optimization in
itself because we would be a bit more aggressive with the updates, but
I agree to do that in the scope of this fix. The updated attached
uses prev_lsn for the job, for both the ABORT and CONFIRM cases,
meaning a one-liner.

I assumed that behavior was intentional of the original patch but I'm
fine with the new version patch too if it's not.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Amit Kapila

amit.kapila16@gmail.com

9 months ago

In reply to: Michael Paquier (#7)

Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

On Fri, Aug 1, 2025 at 5:15 PM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Aug 01, 2025 at 03:30:17PM +0530, Amit Kapila wrote:

I mean to say we can use the same change LSN both for
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM and
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT. Right now, for
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM, we switch the change to
specinsert which would have a prior LSN value (say, if confirm/abort
record will have value, 1000, it will be 800 or so) but we should
still use 1000 for update_progress_txn. The update_progress_txn() is
helpful when such an insert is skipped by a plugin (in this case
pgouput) and in that case, we would require the latest LSN processed
by reorder buffer to pass to it. We use it to send a keep_alive to a
client with the last LSN processed.

Ah, OK, I've missed your point then. It's kind of an optimization in
itself because we would be a bit more aggressive with the updates, but
I agree to do that in the scope of this fix. The updated attached
uses prev_lsn for the job, for both the ABORT and CONFIRM cases,
meaning a one-liner.

The proposed change looks good to me.

--
With Regards,
Amit Kapila.

#10

Michael Paquier

michael@paquier.xyz

9 months ago

In reply to: Amit Kapila (#9)

Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

On Sat, Aug 02, 2025 at 08:57:10AM +0530, Amit Kapila wrote:

The proposed change looks good to me.

Thanks. I've learnt that my schedule is going to be a bit unstable at
the beginning of the next week and I had a window today, so done now.
--
Michael

Use-after-free in reorderbuffer.c for INSERT ON CONFLICT

Attachments:

Attachments: