Fix possible dereference null pointer (src/backend/replication/logical/reorderbuffer.c)

Started by Ranier Vilelaabout 2 years ago10 messageshackers
Jump to latest
#1Ranier Vilela
ranier.vf@gmail.com

Hi,

Per Coverity.

The function ReorderBufferTXNByXid,
can return NULL when the parameter *create* is false.

In the functions ReorderBufferSetBaseSnapshot
and ReorderBufferXidHasBaseSnapshot,
the second call to ReorderBufferTXNByXid,
pass false to *create* argument.

In the function ReorderBufferSetBaseSnapshot,
fixed passing true as argument to always return
a valid ReorderBufferTXN pointer.

In the function ReorderBufferXidHasBaseSnapshot,
fixed by checking if the pointer is NULL.

best regards,
Ranier Vilela

Attachments:

fix-possible-dereference-null-pointer-reorderbuffer.patchapplication/octet-stream; name=fix-possible-dereference-null-pointer-reorderbuffer.patchDownload+6-2
#2Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Ranier Vilela (#1)
Re: Fix possible dereference null pointer (src/backend/replication/logical/reorderbuffer.c)

On 10/04/2024 21:07, Ranier Vilela wrote:

Hi,

Per Coverity.

The function ReorderBufferTXNByXid,
can return NULL when the parameter *create* is false.

In the functions ReorderBufferSetBaseSnapshot
and ReorderBufferXidHasBaseSnapshot,
the second call to ReorderBufferTXNByXid,
pass false to *create* argument.

In the function ReorderBufferSetBaseSnapshot,
fixed passing true as argument to always return
a valid ReorderBufferTXN pointer.

In the function ReorderBufferXidHasBaseSnapshot,
fixed by checking if the pointer is NULL.

If it's a "known subxid", the top-level XID should already have its
ReorderBufferTXN entry, so ReorderBufferTXN() should never return NULL.
It's not surprising if Coverity doesn't understand that, but setting the
'create' flag doesn't seem like the right fix. If we add "Assert(txn !=
NULL)", does that silence it?

--
Heikki Linnakangas
Neon (https://neon.tech)

#3Ranier Vilela
ranier.vf@gmail.com
In reply to: Heikki Linnakangas (#2)
Re: Fix possible dereference null pointer (src/backend/replication/logical/reorderbuffer.c)

Em qua., 10 de abr. de 2024 às 18:28, Heikki Linnakangas <hlinnaka@iki.fi>
escreveu:

On 10/04/2024 21:07, Ranier Vilela wrote:

Hi,

Per Coverity.

The function ReorderBufferTXNByXid,
can return NULL when the parameter *create* is false.

In the functions ReorderBufferSetBaseSnapshot
and ReorderBufferXidHasBaseSnapshot,
the second call to ReorderBufferTXNByXid,
pass false to *create* argument.

In the function ReorderBufferSetBaseSnapshot,
fixed passing true as argument to always return
a valid ReorderBufferTXN pointer.

In the function ReorderBufferXidHasBaseSnapshot,
fixed by checking if the pointer is NULL.

If it's a "known subxid", the top-level XID should already have its
ReorderBufferTXN entry, so ReorderBufferTXN() should never return NULL.

There are several conditions to not return NULL,
I think trusting never is insecure.

It's not surprising if Coverity doesn't understand that, but setting the

'create' flag doesn't seem like the right fix.

ReorderBufferSetBaseSnapshot always expects that *txn* exists.
If a second call fails, the only solution is to create a new one, no?

If we add "Assert(txn !=
NULL)", does that silence it?

I think no.
I always compile it as a release to send to Coverity.

best regards,
Ranier Vilela

#4Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Ranier Vilela (#3)
Re: Fix possible dereference null pointer (src/backend/replication/logical/reorderbuffer.c)

On 11/04/2024 15:03, Ranier Vilela wrote:

Em qua., 10 de abr. de 2024 às 18:28, Heikki Linnakangas
<hlinnaka@iki.fi <mailto:hlinnaka@iki.fi>> escreveu:

On 10/04/2024 21:07, Ranier Vilela wrote:

Hi,

Per Coverity.

The function ReorderBufferTXNByXid,
can return NULL when the parameter *create* is false.

In the functions ReorderBufferSetBaseSnapshot
and ReorderBufferXidHasBaseSnapshot,
the second call to ReorderBufferTXNByXid,
pass false to *create* argument.

In the function ReorderBufferSetBaseSnapshot,
fixed passing true as argument to always return
a valid ReorderBufferTXN pointer.

In the function ReorderBufferXidHasBaseSnapshot,
fixed by checking if the pointer is NULL.

If it's a "known subxid", the top-level XID should already have its
ReorderBufferTXN entry, so ReorderBufferTXN() should never return NULL.

There are several conditions to not return NULL,
I think trusting never is insecure.

Well, you could make it an elog(ERROR, ..) instead. But the point is
that it should not happen, and if it does for some reason, that's very
suprpising and there is a bug somewhere. In that case, we should *not*
just blindly create it and proceed as if everything was OK.

--
Heikki Linnakangas
Neon (https://neon.tech)

#5Ranier Vilela
ranier.vf@gmail.com
In reply to: Heikki Linnakangas (#4)
Re: Fix possible dereference null pointer (src/backend/replication/logical/reorderbuffer.c)

Em qui., 11 de abr. de 2024 às 09:54, Heikki Linnakangas <hlinnaka@iki.fi>
escreveu:

On 11/04/2024 15:03, Ranier Vilela wrote:

Em qua., 10 de abr. de 2024 às 18:28, Heikki Linnakangas
<hlinnaka@iki.fi <mailto:hlinnaka@iki.fi>> escreveu:

On 10/04/2024 21:07, Ranier Vilela wrote:

Hi,

Per Coverity.

The function ReorderBufferTXNByXid,
can return NULL when the parameter *create* is false.

In the functions ReorderBufferSetBaseSnapshot
and ReorderBufferXidHasBaseSnapshot,
the second call to ReorderBufferTXNByXid,
pass false to *create* argument.

In the function ReorderBufferSetBaseSnapshot,
fixed passing true as argument to always return
a valid ReorderBufferTXN pointer.

In the function ReorderBufferXidHasBaseSnapshot,
fixed by checking if the pointer is NULL.

If it's a "known subxid", the top-level XID should already have its
ReorderBufferTXN entry, so ReorderBufferTXN() should never return

NULL.

There are several conditions to not return NULL,
I think trusting never is insecure.

Well, you could make it an elog(ERROR, ..) instead. But the point is
that it should not happen, and if it does for some reason, that's very
suprpising and there is a bug somewhere. In that case, we should *not*
just blindly create it and proceed as if everything was OK.

Thanks for the clarification.
I will then suggest improving robustness,
but avoiding hiding any possible errors that may occur.

v1 patch attached.

best regards,
Ranier Vilela

Attachments:

v1-fix-possible-dereference-null-pointer-reorderbuffer.patchapplication/octet-stream; name=v1-fix-possible-dereference-null-pointer-reorderbuffer.patchDownload+9-0
#6Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Ranier Vilela (#5)
Re: Fix possible dereference null pointer (src/backend/replication/logical/reorderbuffer.c)

On 11/04/2024 16:37, Ranier Vilela wrote:

Em qui., 11 de abr. de 2024 às 09:54, Heikki Linnakangas
<hlinnaka@iki.fi <mailto:hlinnaka@iki.fi>> escreveu:

On 11/04/2024 15:03, Ranier Vilela wrote:

Em qua., 10 de abr. de 2024 às 18:28, Heikki Linnakangas
<hlinnaka@iki.fi <mailto:hlinnaka@iki.fi> <mailto:hlinnaka@iki.fi

<mailto:hlinnaka@iki.fi>>> escreveu:

     On 10/04/2024 21:07, Ranier Vilela wrote:
      > Hi,
      >
      > Per Coverity.
      >
      > The function ReorderBufferTXNByXid,
      > can return NULL when the parameter *create* is false.
      >
      > In the functions ReorderBufferSetBaseSnapshot
      > and ReorderBufferXidHasBaseSnapshot,
      > the second call to ReorderBufferTXNByXid,
      > pass false to *create* argument.
      >
      > In the function ReorderBufferSetBaseSnapshot,
      > fixed passing true as argument to always return
      > a valid ReorderBufferTXN pointer.
      >
      > In the function ReorderBufferXidHasBaseSnapshot,
      > fixed by checking if the pointer is NULL.

     If it's a "known subxid", the top-level XID should already

have its

     ReorderBufferTXN entry, so ReorderBufferTXN() should never

return NULL.

There are several conditions to not return NULL,
I think trusting never is insecure.

Well, you could make it an elog(ERROR, ..) instead. But the point is
that it should not happen, and if it does for some reason, that's very
suprpising and there is a bug somewhere. In that case, we should *not*
just blindly create it and proceed as if everything was OK.

Thanks for the clarification.
I will then suggest improving robustness,
but avoiding hiding any possible errors that may occur.

I don't much like adding extra runtime checks for "can't happen"
scenarios either. Assertions would be more clear, but in this case the
code would just segfault trying to dereference the NULL pointer, which
is fine for a "can't happen" scenario.

Looking closer, when we identify an XID as a subtransaction, we:
- assign toplevel_xid,
- set RBTXN_IS_SUBXACT, and
- assign toptxn pointer.

ISTM the 'toplevel_xid' and RBTXN_IS_SUBXACT are redundant.
'toplevel_xid' is only used in those two calls that do
ReorderBufferTXNByXid(rb, txn->toplevel_xid,...), and you could replace
those by following the 'toptxn' pointer directly. And RBTXN_IS_SUBXACT
is redundant with (toptxn != NULL). So here's a patch to remove
'toplevel_xid' and RBTXN_IS_SUBXACT altogether.

Amit, you added 'toptxn' in commit c55040ccd017; does this look right to
you?

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

0001-Remove-redundant-field-and-flag-from-ReorderBufferTX.patchtext/x-patch; charset=UTF-8; name=0001-Remove-redundant-field-and-flag-from-ReorderBufferTX.patchDownload+14-30
#7Ranier Vilela
ranier.vf@gmail.com
In reply to: Heikki Linnakangas (#6)
Re: Fix possible dereference null pointer (src/backend/replication/logical/reorderbuffer.c)

Em sáb., 13 de abr. de 2024 às 04:16, Heikki Linnakangas <hlinnaka@iki.fi>
escreveu:

On 11/04/2024 16:37, Ranier Vilela wrote:

Em qui., 11 de abr. de 2024 às 09:54, Heikki Linnakangas
<hlinnaka@iki.fi <mailto:hlinnaka@iki.fi>> escreveu:

On 11/04/2024 15:03, Ranier Vilela wrote:

Em qua., 10 de abr. de 2024 às 18:28, Heikki Linnakangas
<hlinnaka@iki.fi <mailto:hlinnaka@iki.fi> <mailto:hlinnaka@iki.fi

<mailto:hlinnaka@iki.fi>>> escreveu:

On 10/04/2024 21:07, Ranier Vilela wrote:

Hi,

Per Coverity.

The function ReorderBufferTXNByXid,
can return NULL when the parameter *create* is false.

In the functions ReorderBufferSetBaseSnapshot
and ReorderBufferXidHasBaseSnapshot,
the second call to ReorderBufferTXNByXid,
pass false to *create* argument.

In the function ReorderBufferSetBaseSnapshot,
fixed passing true as argument to always return
a valid ReorderBufferTXN pointer.

In the function ReorderBufferXidHasBaseSnapshot,
fixed by checking if the pointer is NULL.

If it's a "known subxid", the top-level XID should already

have its

ReorderBufferTXN entry, so ReorderBufferTXN() should never

return NULL.

There are several conditions to not return NULL,
I think trusting never is insecure.

Well, you could make it an elog(ERROR, ..) instead. But the point is
that it should not happen, and if it does for some reason, that's

very

suprpising and there is a bug somewhere. In that case, we should

*not*

just blindly create it and proceed as if everything was OK.

Thanks for the clarification.
I will then suggest improving robustness,
but avoiding hiding any possible errors that may occur.

I don't much like adding extra runtime checks for "can't happen"
scenarios either. Assertions would be more clear, but in this case the
code would just segfault trying to dereference the NULL pointer, which
is fine for a "can't happen" scenario.

This sounds a little confusing to me.
Is the project policy to *tolerate* dereferencing NULL pointers?
If this is the case, no problem, using Assert would serve the purpose of
protecting against these errors well.

Looking closer, when we identify an XID as a subtransaction, we:
- assign toplevel_xid,
- set RBTXN_IS_SUBXACT, and
- assign toptxn pointer.

ISTM the 'toplevel_xid' and RBTXN_IS_SUBXACT are redundant.
'toplevel_xid' is only used in those two calls that do
ReorderBufferTXNByXid(rb, txn->toplevel_xid,...), and you could replace
those by following the 'toptxn' pointer directly. And RBTXN_IS_SUBXACT
is redundant with (toptxn != NULL). So here's a patch to remove
'toplevel_xid' and RBTXN_IS_SUBXACT altogether.

+ if (rbtxn_is_subtxn(txn))
+ txn = rbtxn_get_toptxn(txn);

rbtxn_get_toptxn already calls rbtxn_is_subtx,
which adds a little unnecessary overhead.
I made a v1 of your patch, modifying this, please ignore it if you don't
agree.

best regards,
Ranier Vilela

Attachments:

v1-0001-Remove-redundant-field-and-flag-from-ReorderBufferTX.patchapplication/octet-stream; name=v1-0001-Remove-redundant-field-and-flag-from-ReorderBufferTX.patchDownload+13-18
#8Michael Paquier
michael@paquier.xyz
In reply to: Ranier Vilela (#7)
Re: Fix possible dereference null pointer (src/backend/replication/logical/reorderbuffer.c)

On Sat, Apr 13, 2024 at 10:40:35AM -0300, Ranier Vilela wrote:

This sounds a little confusing to me.
Is the project policy to *tolerate* dereferencing NULL pointers?
If this is the case, no problem, using Assert would serve the purpose of
protecting against these errors well.

In most cases, it does not matter to have an assertion for a code path
that's going to crash a few lines later. The result is the same: the
code will crash and inform about the failure.

rbtxn_get_toptxn already calls rbtxn_is_subtx,
which adds a little unnecessary overhead.
I made a v1 of your patch, modifying this, please ignore it if you don't
agree.

c55040ccd017 has been added in v14~, so this is material for 18~ at
this stage. If you want to move on with these changes, I'd suggest to
add a CF entry.

FWIW, I think that I agree with the point made upthread by Heikki
about the fact that these extra ReorderBufferTXNByXid() are redundant.
In these two code paths, the ReorderBufferTXN entry of top transaction
ID should exist, so removing toplevel_xid makes sense.

It may be worth exploring if more simplifications around
ReorderBufferTXNByXid() are possible, actually.
--
Michael

#9Amit Kapila
amit.kapila16@gmail.com
In reply to: Heikki Linnakangas (#6)
Re: Fix possible dereference null pointer (src/backend/replication/logical/reorderbuffer.c)

On Sat, Apr 13, 2024 at 12:46 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

I don't much like adding extra runtime checks for "can't happen"
scenarios either. Assertions would be more clear, but in this case the
code would just segfault trying to dereference the NULL pointer, which
is fine for a "can't happen" scenario.

Isn't the existing assertion (Assert(!create || txn != NULL);) in
ReorderBufferTXNByXid() sufficient to handle this case?

Looking closer, when we identify an XID as a subtransaction, we:
- assign toplevel_xid,
- set RBTXN_IS_SUBXACT, and
- assign toptxn pointer.

ISTM the 'toplevel_xid' and RBTXN_IS_SUBXACT are redundant.
'toplevel_xid' is only used in those two calls that do
ReorderBufferTXNByXid(rb, txn->toplevel_xid,...), and you could replace
those by following the 'toptxn' pointer directly. And RBTXN_IS_SUBXACT
is redundant with (toptxn != NULL). So here's a patch to remove
'toplevel_xid' and RBTXN_IS_SUBXACT altogether.

Good idea. I don't see a problem with this idea.

@@ -1135,8 +1133,6 @@ static void
 ReorderBufferTransferSnapToParent(ReorderBufferTXN *txn,
    ReorderBufferTXN *subtxn)
 {
- Assert(subtxn->toplevel_xid == txn->xid);

Is there a benefit in converting this assertion using toptxn?

--
With Regards,
Amit Kapila.

#10Ranier Vilela
ranier.vf@gmail.com
In reply to: Michael Paquier (#8)
Re: Fix possible dereference null pointer (src/backend/replication/logical/reorderbuffer.c)

Em dom., 14 de abr. de 2024 às 21:29, Michael Paquier <michael@paquier.xyz>
escreveu:

On Sat, Apr 13, 2024 at 10:40:35AM -0300, Ranier Vilela wrote:

This sounds a little confusing to me.
Is the project policy to *tolerate* dereferencing NULL pointers?
If this is the case, no problem, using Assert would serve the purpose of
protecting against these errors well.

In most cases, it does not matter to have an assertion for a code path
that's going to crash a few lines later. The result is the same: the
code will crash and inform about the failure.

rbtxn_get_toptxn already calls rbtxn_is_subtx,
which adds a little unnecessary overhead.
I made a v1 of your patch, modifying this, please ignore it if you don't
agree.

c55040ccd017 has been added in v14~, so this is material for 18~ at
this stage. If you want to move on with these changes, I'd suggest to
add a CF entry.

I think it's worth it, because it's a case of a possible bug, but very
unlikely,
and Heikki's suggestions improve the code.

best regards,
Ranier Vilela