Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

ranier.vf@gmail.com

almost 2 years ago

In reply to: Ranier Vilela (#1)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

Em qua., 22 de mai. de 2024 às 11:44, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Hi.

Per Coverity.

2. returned_null: SearchSysCacheAttName returns NULL (checked 20 out of
21 times).
3. var_assigned: Assigning: ptup = NULL return value from
SearchSysCacheAttName.
964 ptup = SearchSysCacheAttName(relid, attname);
CID 1545986: (#1 of 1): Dereference null return value (NULL_RETURNS)
4. dereference: Dereferencing ptup, which is known to be NULL.

The functions SearchSysCacheAttNum and SearchSysCacheAttName,
need to have the result checked.

The commit 5091995
<https://github.com/postgres/postgres/commit/509199587df73f06eda898ae13284292f4ae573a>,
left an oversight.

Fixed by the patch attached, a change of style, unfortunately, was
necessary.

v1 Attached, fix wrong column variable name in error report.

best regards,
Ranier Vilela

ranier.vf@gmail.com

almost 2 years ago

In reply to: Ranier Vilela (#2)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

Em qua., 22 de mai. de 2024 às 13:09, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em qua., 22 de mai. de 2024 às 11:44, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Hi.

Per Coverity.

2. returned_null: SearchSysCacheAttName returns NULL (checked 20 out of
21 times).
3. var_assigned: Assigning: ptup = NULL return value from
SearchSysCacheAttName.
964 ptup = SearchSysCacheAttName(relid, attname);
CID 1545986: (#1 of 1): Dereference null return value (NULL_RETURNS)
4. dereference: Dereferencing ptup, which is known to be NULL.

The functions SearchSysCacheAttNum and SearchSysCacheAttName,
need to have the result checked.

The commit 5091995
<https://github.com/postgres/postgres/commit/509199587df73f06eda898ae13284292f4ae573a>,
left an oversight.

Fixed by the patch attached, a change of style, unfortunately, was
necessary.

v1 Attached, fix wrong column variable name in error report.

1. Another concern is the function *get_partition_ancestors*,
which may return NIL, which may affect *llast_oid*, which does not handle
NIL entries.

2. Is checking *relispartition* enough?
There a function *check_rel_can_be_partition*
(src/backend/utils/adt/partitionfuncs.c),
which performs a much more robust check, would it be worth using it?

With the v2 attached, 1 is handled, but, in this case,
will it be the most correct?

best regards,
Ranier Vilela

michael@paquier.xyz

almost 2 years ago

In reply to: Ranier Vilela (#3)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

On Wed, May 22, 2024 at 03:28:48PM -0300, Ranier Vilela wrote:

1. Another concern is the function *get_partition_ancestors*,
which may return NIL, which may affect *llast_oid*, which does not handle
NIL entries.

Hm? We already know in the code path that the relation we are dealing
with when calling get_partition_ancestors() *is* a partition thanks to
the check on relispartition, no? In this case, calling
get_partition_ancestors() is valid and there should be a top-most
parent in any case all the time. So I don't get the point of checking
get_partition_ancestors() for NIL-ness just for the sake of assuming
that it would be possible.

2. Is checking *relispartition* enough?
There a function *check_rel_can_be_partition*
(src/backend/utils/adt/partitionfuncs.c),
which performs a much more robust check, would it be worth using it?

With the v2 attached, 1 is handled, but, in this case,
will it be the most correct?

Saying that, your point about the result of SearchSysCacheAttName not
checked if it is a valid tuple is right. We paint errors in these
cases even if they should not happen as that's useful when it comes to
debugging, at least.
--
Michael

ashutosh.bapat@enterprisedb.com

almost 2 years ago

In reply to: Michael Paquier (#4)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

On Thu, May 23, 2024 at 5:52 AM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, May 22, 2024 at 03:28:48PM -0300, Ranier Vilela wrote:

1. Another concern is the function *get_partition_ancestors*,
which may return NIL, which may affect *llast_oid*, which does not handle
NIL entries.

Hm? We already know in the code path that the relation we are dealing
with when calling get_partition_ancestors() *is* a partition thanks to
the check on relispartition, no? In this case, calling
get_partition_ancestors() is valid and there should be a top-most
parent in any case all the time. So I don't get the point of checking
get_partition_ancestors() for NIL-ness just for the sake of assuming
that it would be possible.

+1.

2. Is checking *relispartition* enough?
There a function *check_rel_can_be_partition*
(src/backend/utils/adt/partitionfuncs.c),
which performs a much more robust check, would it be worth using it?

With the v2 attached, 1 is handled, but, in this case,
will it be the most correct?

Saying that, your point about the result of SearchSysCacheAttName not
checked if it is a valid tuple is right. We paint errors in these
cases even if they should not happen as that's useful when it comes to
debugging, at least.

I think an Assert would do instead of whole ereport(). The callers have
already resolved attribute name to attribute number. Hence the attribute
*should* exist in both partition as well as topmost partitioned table.

   relid = llast_oid(ancestors);
+
  ptup = SearchSysCacheAttName(relid, attname);
+ if (!HeapTupleIsValid(ptup))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" of relation \"%s\" does not exist",
+ attname, RelationGetRelationName(rel))));

We changed the relid from OID of partition to that of topmost partitioned
table but didn't change rel; which still points to partition relation. We
have to invoke relation_open() with new relid, in order to use rel in the
error message. I don't think all that is worth it, unless we find a
scenario when SearchSysCacheAttName() returns NULL.

--
Best Wishes,
Ashutosh Bapat

ranier.vf@gmail.com

almost 2 years ago

In reply to: Michael Paquier (#4)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

Hi Micheal,

Em qua., 22 de mai. de 2024 às 21:21, Michael Paquier <michael@paquier.xyz>
escreveu:

On Wed, May 22, 2024 at 03:28:48PM -0300, Ranier Vilela wrote:

1. Another concern is the function *get_partition_ancestors*,
which may return NIL, which may affect *llast_oid*, which does not handle
NIL entries.

Hm? We already know in the code path that the relation we are dealing
with when calling get_partition_ancestors() *is* a partition thanks to
the check on relispartition, no? In this case, calling
get_partition_ancestors() is valid and there should be a top-most
parent in any case all the time. So I don't get the point of checking
get_partition_ancestors() for NIL-ness just for the sake of assuming
that it would be possible.

I don't have strong feelings about this.
But analyzing the function, *pg_partition_root*
(src/backend/utils/adt/partitionfuncs.c),
we see that checking whether it is a partition is done by
check_rel_can_be_partition.
And it doesn't trust get_partition_ancestors, checking
if the return is NIL.

2. Is checking *relispartition* enough?
There a function *check_rel_can_be_partition*
(src/backend/utils/adt/partitionfuncs.c),
which performs a much more robust check, would it be worth using it?

With the v2 attached, 1 is handled, but, in this case,
will it be the most correct?

Saying that, your point about the result of SearchSysCacheAttName not
checked if it is a valid tuple is right. We paint errors in these
cases even if they should not happen as that's useful when it comes to
debugging, at least.

Thanks.

best regards,
Ranier Vilela

Show quoted text

--
Michael

ranier.vf@gmail.com

almost 2 years ago

In reply to: Ashutosh Bapat (#5)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

Em qui., 23 de mai. de 2024 às 06:27, Ashutosh Bapat <
ashutosh.bapat.oss@gmail.com> escreveu:

On Thu, May 23, 2024 at 5:52 AM Michael Paquier <michael@paquier.xyz>
wrote:

On Wed, May 22, 2024 at 03:28:48PM -0300, Ranier Vilela wrote:

1. Another concern is the function *get_partition_ancestors*,
which may return NIL, which may affect *llast_oid*, which does not

handle

NIL entries.

Hm? We already know in the code path that the relation we are dealing
with when calling get_partition_ancestors() *is* a partition thanks to
the check on relispartition, no? In this case, calling
get_partition_ancestors() is valid and there should be a top-most
parent in any case all the time. So I don't get the point of checking
get_partition_ancestors() for NIL-ness just for the sake of assuming
that it would be possible.

+1.

2. Is checking *relispartition* enough?
There a function *check_rel_can_be_partition*
(src/backend/utils/adt/partitionfuncs.c),
which performs a much more robust check, would it be worth using it?

With the v2 attached, 1 is handled, but, in this case,
will it be the most correct?

Saying that, your point about the result of SearchSysCacheAttName not
checked if it is a valid tuple is right. We paint errors in these
cases even if they should not happen as that's useful when it comes to
debugging, at least.

I think an Assert would do instead of whole ereport().

IMO, Assert there is no better solution here.

The callers have already resolved attribute name to attribute number.
Hence the attribute *should* exist in both partition as well as topmost
partitioned table.
relid = llast_oid(ancestors);
+
ptup = SearchSysCacheAttName(relid, attname);
+ if (!HeapTupleIsValid(ptup))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" of relation \"%s\" does not exist",
+ attname, RelationGetRelationName(rel))));
We changed the relid from OID of partition to that of topmost partitioned
table but didn't change rel; which still points to partition relation. We
have to invoke relation_open() with new relid, in order to use rel in the
error message. I don't think all that is worth it, unless we find a
scenario when SearchSysCacheAttName() returns NULL.

All calls to functions like SearchSysCacheAttName, in the whole codebase,
checks if returns are valid.
It must be for a very strong reason, such a style.

So, v3, implements it this way.

best regards,
Ranier Vilela

michael@paquier.xyz

almost 2 years ago

In reply to: Ranier Vilela (#7)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

On Thu, May 23, 2024 at 08:54:12AM -0300, Ranier Vilela wrote:

All calls to functions like SearchSysCacheAttName, in the whole codebase,
checks if returns are valid.
It must be for a very strong reason, such a style.

Usually good practice, as I've outlined once upthread, because we do
expect the attributes to exist in this case. Or if you want, an error
is better than a crash if a concurrent path causes this area to lead
to inconsistent lookups, which is something I've seen in the past
while hacking on my own stuff, or just fix other things causing
syscache lookup inconsistencies. You'd be surprised to hear that
dropped attributes being mishandled is not that uncommon, especially
in out-of-core code, as one example. FWIW, I don't see much a point
in using ereport(), the two checks ought to be elog()s pointing to an
internal error as these two errors should never happen. Still, it is
a good idea to check that they never happen: aka an internal
error state is better than a crash if a problem arises.

So, v3, implements it this way.

I don't understand the point behind the open/close of attrelation,
TBH. That's not needed.

Except fot these two points, this is just moving the calls to make
sure that we have valid tuples from the syscache, which is a better
practice. 509199587df7 is recent enough that this should be fixed now
rather than later.
--
Michael

ashutosh.bapat@enterprisedb.com

almost 2 years ago

In reply to: Michael Paquier (#8)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

On Fri, May 24, 2024 at 11:03 AM Michael Paquier <michael@paquier.xyz>
wrote:

On Thu, May 23, 2024 at 08:54:12AM -0300, Ranier Vilela wrote:

All calls to functions like SearchSysCacheAttName, in the whole codebase,
checks if returns are valid.
It must be for a very strong reason, such a style.

Usually good practice, as I've outlined once upthread, because we do
expect the attributes to exist in this case. Or if you want, an error
is better than a crash if a concurrent path causes this area to lead
to inconsistent lookups, which is something I've seen in the past
while hacking on my own stuff, or just fix other things causing
syscache lookup inconsistencies. You'd be surprised to hear that
dropped attributes being mishandled is not that uncommon, especially
in out-of-core code, as one example. FWIW, I don't see much a point
in using ereport(), the two checks ought to be elog()s pointing to an
internal error as these two errors should never happen. Still, it is
a good idea to check that they never happen: aka an internal
error state is better than a crash if a problem arises.

So, v3, implements it this way.

I don't understand the point behind the open/close of attrelation,
TBH. That's not needed.

Except fot these two points, this is just moving the calls to make
sure that we have valid tuples from the syscache, which is a better
practice. 509199587df7 is recent enough that this should be fixed now
rather than later.

If we are looking for avoiding a segfault and get a message which helps
debugging, using get_attname and get_attnum might be better options.
get_attname throws an error. get_attnum doesn't throw an error and returns
InvalidAttnum which won't return any valid identity sequence, and thus
return a NIL sequence list which is handled in that function already. Using
these two functions will avoid the clutter as well as segfault. If that's
acceptable, I will provide a patch.

--
Best Wishes,
Ashutosh Bapat

#10

michael@paquier.xyz

almost 2 years ago

In reply to: Ashutosh Bapat (#9)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

On Fri, May 24, 2024 at 11:58:51AM +0530, Ashutosh Bapat wrote:

If we are looking for avoiding a segfault and get a message which helps
debugging, using get_attname and get_attnum might be better options.
get_attname throws an error. get_attnum doesn't throw an error and returns
InvalidAttnum which won't return any valid identity sequence, and thus
return a NIL sequence list which is handled in that function already. Using
these two functions will avoid the clutter as well as segfault. If that's
acceptable, I will provide a patch.

Yeah, you could do that with these two routines as well. The result
would be the same in terms of runtime validity checks.
--
Michael

#11

ashutosh.bapat@enterprisedb.com

almost 2 years ago

In reply to: Michael Paquier (#10)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

On Fri, May 24, 2024 at 12:16 PM Michael Paquier <michael@paquier.xyz>
wrote:

On Fri, May 24, 2024 at 11:58:51AM +0530, Ashutosh Bapat wrote:

If we are looking for avoiding a segfault and get a message which helps
debugging, using get_attname and get_attnum might be better options.
get_attname throws an error. get_attnum doesn't throw an error and

returns

InvalidAttnum which won't return any valid identity sequence, and thus
return a NIL sequence list which is handled in that function already.

Using

these two functions will avoid the clutter as well as segfault. If that's
acceptable, I will provide a patch.

Yeah, you could do that with these two routines as well. The result
would be the same in terms of runtime validity checks.

PFA patch using those two routines.

--
Best Wishes,
Ashutosh Bapat

#12

ranier.vf@gmail.com

almost 2 years ago

In reply to: Ashutosh Bapat (#11)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

Em sex., 24 de mai. de 2024 às 08:48, Ashutosh Bapat <
ashutosh.bapat.oss@gmail.com> escreveu:

On Fri, May 24, 2024 at 12:16 PM Michael Paquier <michael@paquier.xyz>
wrote:

On Fri, May 24, 2024 at 11:58:51AM +0530, Ashutosh Bapat wrote:

If we are looking for avoiding a segfault and get a message which helps
debugging, using get_attname and get_attnum might be better options.
get_attname throws an error. get_attnum doesn't throw an error and

returns

InvalidAttnum which won't return any valid identity sequence, and thus
return a NIL sequence list which is handled in that function already.

Using

these two functions will avoid the clutter as well as segfault. If

that's

acceptable, I will provide a patch.

Yeah, you could do that with these two routines as well. The result
would be the same in terms of runtime validity checks.

PFA patch using those two routines.

The function *get_attname* palloc the result name (pstrdup).
Isn't it necessary to free the memory here (pfree)?

best regards,
Ranier Vilela

#13

michael@paquier.xyz

almost 2 years ago

In reply to: Ranier Vilela (#12)

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

On Fri, May 24, 2024 at 09:05:35AM -0300, Ranier Vilela wrote:

The function *get_attname* palloc the result name (pstrdup).
Isn't it necessary to free the memory here (pfree)?

This is going to be freed with the current memory context, and all the
callers of getIdentitySequence() are in query execution paths, so I
don't see much the point. A second thing was a missing check on the
attnum returned by get_attnum() with InvalidAttrNumber. I'd be
tempted to introduce a missing_ok to this routine after looking at the
callers in all the tree, as some of them want to fail still would not
expect it, so that would reduce a bit the elog churn. That's a story
for a different day, though.
--
Michael

#14