Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

Started by Ranier Vilela7 months ago10 messages

ranier.vf@gmail.com

7 months ago

1 attachment(s)

Hi.

The function *ScanPgRelation* can return a invalid tuple.
It is necessary to check the function's return,
as is already done in other parts of the source.

patch attached.

Best regards,
Ranier Vilela

Attachments:

fix-possible-dereference-null-pointer-relcache.patchapplication/octet-stream; name=fix-possible-dereference-null-pointer-relcache.patchDownload

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 559ba9cdb2..e4b3a4e855 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2417,6 +2417,9 @@ RelationReloadNailed(Relation relation)
 
 		pg_class_tuple = ScanPgRelation(RelationGetRelid(relation),
 										true, false);
+		if (!HeapTupleIsValid(pg_class_tuple))
+			elog(ERROR, "could not find pg_class entry for %u",
+				 RelationGetRelid(relation));
 		relp = (Form_pg_class) GETSTRUCT(pg_class_tuple);
 		memcpy(relation->rd_rel, relp, CLASS_TUPLE_SIZE);
 		heap_freetuple(pg_class_tuple);

Aleksander Alekseev

aleksander@timescale.com

7 months ago

In reply to: Ranier Vilela (#1)

Re: Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

Hi,

The function *ScanPgRelation* can return a invalid tuple.
It is necessary to check the function's return,
as is already done in other parts of the source.

Good find although I'm not certain if I agree with the patch.

We would be in serious trouble if RelationReloadNailed() could be
called for a non-existing relation. Wouldn't Assert() be more
appropriate?

--
Best regards,
Aleksander Alekseev

Ranier Vilela

ranier.vf@gmail.com

7 months ago

In reply to: Aleksander Alekseev (#2)

Re: Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

Hi Aleksander.

Em ter., 17 de jun. de 2025 às 09:35, Aleksander Alekseev <
aleksander@timescale.com> escreveu:

Hi,

The function *ScanPgRelation* can return a invalid tuple.
It is necessary to check the function's return,
as is already done in other parts of the source.

Good find although I'm not certain if I agree with the patch.

We would be in serious trouble if RelationReloadNailed() could be
called for a non-existing relation. Wouldn't Assert() be more
appropriate?

IMO, I think no.

In all paths, this check is done, why would this be the only exception?

best regards,
Ranier Vilela

Aleksander Alekseev

aleksander@timescale.com

7 months ago

In reply to: Ranier Vilela (#3)

Re: Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

Hi,

We would be in serious trouble if RelationReloadNailed() could be
called for a non-existing relation. Wouldn't Assert() be more
appropriate?

IMO, I think no.

In all paths, this check is done, why would this be the only exception?

We use Asserts() for cases that shouldn't happen in practice for
performance reasons. Since this code doesn't crash I suspect this is
one of such cases. Unless you are aware of a specific scenario that
makes the code crash of course.

--
Best regards,
Aleksander Alekseev

Ranier Vilela

ranier.vf@gmail.com

7 months ago

In reply to: Aleksander Alekseev (#4)

Re: Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

Em ter., 17 de jun. de 2025 às 10:09, Aleksander Alekseev <
aleksander@timescale.com> escreveu:

Hi,

We would be in serious trouble if RelationReloadNailed() could be
called for a non-existing relation. Wouldn't Assert() be more
appropriate?

IMO, I think no.

In all paths, this check is done, why would this be the only exception?

We use Asserts() for cases that shouldn't happen in practice for
performance reasons.

If you are concerned about performance in this case,
you can rest assured.

Since this code doesn't crash.

No, you can't be sure.

I suspect this is
one of such cases. Unless you are aware of a specific scenario that
makes the code crash of course.

ScanPgRelation really can fail, better make sure.

best regards,
Ranier Vilela

Tomas Vondra

tomas@vondra.me

7 months ago

In reply to: Aleksander Alekseev (#4)

Re: Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

On 6/17/25 15:09, Aleksander Alekseev wrote:

Hi,

We would be in serious trouble if RelationReloadNailed() could be
called for a non-existing relation. Wouldn't Assert() be more
appropriate?

IMO, I think no.

In all paths, this check is done, why would this be the only exception?

We use Asserts() for cases that shouldn't happen in practice for
performance reasons. Since this code doesn't crash I suspect this is
one of such cases. Unless you are aware of a specific scenario that
makes the code crash of course.

Right. Assert() are for checks that we don't expect to ever fail, so we
don't want to keep them in production builds. I believe this is exactly
such case, because the code is about "nailed" catalogs, that we need to
access very early (e.g. before being able to access pg_class).

There's only ~10 of such catalogs (look for formrdesc calls), and if we
get a failure here, it's not clear to me we can really keep the system
running anyway. It's not just the usual "relcache miss" and if the user
retries it will probably work fine. The catalog is borked, and who knows
in what way.

My opinion is that adding a "elog(ERROR)" here would be misleading, as
it implies it's something we expect. And mostly pointless. I can imagine
adding an Assert, but I don't quite see how is that better than just
hitting a segfault a couple lines later.

regards

--
Tomas Vondra

Ranier Vilela

ranier.vf@gmail.com

7 months ago

In reply to: Tomas Vondra (#6)

Re: Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

Em ter., 17 de jun. de 2025 às 10:43, Tomas Vondra <tomas@vondra.me>
escreveu:

On 6/17/25 15:09, Aleksander Alekseev wrote:

Hi,

We would be in serious trouble if RelationReloadNailed() could be
called for a non-existing relation. Wouldn't Assert() be more
appropriate?

IMO, I think no.

In all paths, this check is done, why would this be the only exception?

We use Asserts() for cases that shouldn't happen in practice for
performance reasons. Since this code doesn't crash I suspect this is
one of such cases. Unless you are aware of a specific scenario that
makes the code crash of course.

Right. Assert() are for checks that we don't expect to ever fail, so we
don't want to keep them in production builds. I believe this is exactly
such case, because the code is about "nailed" catalogs, that we need to
access very early (e.g. before being able to access pg_class).

There's only ~10 of such catalogs (look for formrdesc calls), and if we
get a failure here, it's not clear to me we can really keep the system
running anyway. It's not just the usual "relcache miss" and if the user
retries it will probably work fine. The catalog is borked, and who knows
in what way.

My opinion is that adding a "elog(ERROR)" here would be misleading, as
it implies it's something we expect. And mostly pointless. I can imagine
adding an Assert, but I don't quite see how is that better than just
hitting a segfault a couple lines later.

To me this is a contradiction, whether you consider waiting for a segfault
or consider adding an Assert.
For the user it is better to have a log, where he can quickly find the
problem, rather than having to investigate on his own.

best regards,
Ranier Vilela

Aleksander Alekseev

aleksander@timescale.com

7 months ago

In reply to: Ranier Vilela (#7)

Re: Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

Hi Ranier,

To me this is a contradiction, whether you consider waiting for a segfault or consider adding an Assert.
For the user it is better to have a log, where he can quickly find the problem, rather than having to investigate on his own.

That's the point. User's shouldn't encounter this. Thus there is no
reason to break branch prediction for everyone here.

I agree with Tomas that adding Assert() in fact will not change much.
It can serve as a little piece of documentation maybe: "yes we thought
about it", that's all.

--
Best regards,
Aleksander Alekseev

Ranier Vilela

ranier.vf@gmail.com

7 months ago

In reply to: Aleksander Alekseev (#8)

Re: Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

Em ter., 17 de jun. de 2025 às 11:10, Aleksander Alekseev <
aleksander@timescale.com> escreveu:

Hi Ranier,

To me this is a contradiction, whether you consider waiting for a

segfault or consider adding an Assert.

For the user it is better to have a log, where he can quickly find the

problem, rather than having to investigate on his own.

That's the point. User's shouldn't encounter this.

It shouldn't.

Thus there is no
reason to break branch prediction for everyone here.

For now there is no consensus that there will not be a segfault.
If a segfault will never occur, ok.
IMO, this is not a query-dependent issue.
But the table in question is never corrupted or invalid.

best regards,
Ranier Vilela

#10

Daniel Gustafsson

daniel@yesql.se

7 months ago

In reply to: Tomas Vondra (#6)

Re: Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

On 17 Jun 2025, at 15:43, Tomas Vondra <tomas@vondra.me> wrote:

.. if we
get a failure here, it's not clear to me we can really keep the system
running anyway. It's not just the usual "relcache miss" and if the user
retries it will probably work fine. The catalog is borked, and who knows
in what way.

Agreed.

My opinion is that adding a "elog(ERROR)" here would be misleading, as
it implies it's something we expect. And mostly pointless. I can imagine
adding an Assert, but I don't quite see how is that better than just
hitting a segfault a couple lines later.

If I break something here while hacking I'd probably prefer a segfault to an
Assert.

--
Daniel Gustafsson