Don't pass NULL pointer to strcmp().
Hi hackers,
I found that there's a nullable pointer being passed to strcmp() and
can make the server crash. It can be reproduced on the latest master
branch by crafting an extension[1]https://github.com/higuoxing/guc_crash/tree/pg. Patch for fixing it is attatched.
[1]: https://github.com/higuoxing/guc_crash/tree/pg
--
Best Regards,
Xing
Attachments:
0001-Don-t-use-strcmp-with-nullable-pointers.patchtext/x-patch; charset=US-ASCII; name=0001-Don-t-use-strcmp-with-nullable-pointers.patchDownload+3-2
On Wed, Nov 1, 2023 at 5:25 PM Xing Guo <higuoxing@gmail.com> wrote:
Hi hackers,
I found that there's a nullable pointer being passed to strcmp() and
can make the server crash. It can be reproduced on the latest master
branch by crafting an extension[1]. Patch for fixing it is attatched.
Can we set a string guc to NULL? If not, `*lconf->variable == NULL` would
be unnecessary.
--
Best Regards,
--
Regards
Junwang Zhao
Hi,
I found that there's a nullable pointer being passed to strcmp() and
can make the server crash. It can be reproduced on the latest master
branch by crafting an extension[1]. Patch for fixing it is attatched.
Thanks for reporting. I can confirm that the issue reproduces on the
`master` branch and the proposed patch fixes it.
Can we set a string guc to NULL? If not, `*lconf->variable == NULL` would
be unnecessary.
Judging by the rest of the code we better keep it, at least for consistenc.
I see one more place with a similar code in guc.c around line 1472.
Although I don't have exact steps to trigger a crash I suggest adding
a similar check there.
--
Best regards,
Aleksander Alekseev
Hi Aleksander and Junwang,
Thanks for your comments. I have updated the patch accordingly.
Best Regards,
Xing
On Wed, Nov 1, 2023 at 7:44 PM Aleksander Alekseev <aleksander@timescale.com>
wrote:
Show quoted text
Hi,
I found that there's a nullable pointer being passed to strcmp() and
can make the server crash. It can be reproduced on the latest master
branch by crafting an extension[1]. Patch for fixing it is attatched.Thanks for reporting. I can confirm that the issue reproduces on the
`master` branch and the proposed patch fixes it.Can we set a string guc to NULL? If not, `*lconf->variable == NULL` would
be unnecessary.Judging by the rest of the code we better keep it, at least for consistenc.
I see one more place with a similar code in guc.c around line 1472.
Although I don't have exact steps to trigger a crash I suggest adding
a similar check there.--
Best regards,
Aleksander Alekseev
Attachments:
v2-0001-Don-t-use-strcmp-with-nullable-pointers.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Don-t-use-strcmp-with-nullable-pointers.patchDownload+5-3
Xing Guo <higuoxing@gmail.com> writes:
Thanks for your comments. I have updated the patch accordingly.
I'm leery of accepting this patch, as I see no reason that we
should consider it valid for an extension to have a string GUC
with a boot_val of NULL.
I realize that we have a few core GUCs that are like that, but
I'm pretty sure that every one of them has special-case code
that initializes the GUC to something non-null a bit later on
in startup. I don't think there are any cases where a string
GUC's persistent value will be null, and I don't like the
idea of considering that to be an allowed case. It would
open the door to more crash situations, and it brings up the
old question of how could a user tell NULL from empty string
(via SHOW or current_setting() or whatever). Besides, what's
the benefit really?
regards, tom lane
Hi Tom,
There're extensions that set their boot_val to NULL. E.g., postgres_fdw (
https://github.com/postgres/postgres/blob/4210b55f598534db9d52c4535b7dcc777dda75a6/contrib/postgres_fdw/option.c#L582),
plperl (
https://github.com/postgres/postgres/blob/4210b55f598534db9d52c4535b7dcc777dda75a6/src/pl/plperl/plperl.c#L422C13-L422C13,
https://github.com/postgres/postgres/blob/4210b55f598534db9d52c4535b7dcc777dda75a6/src/pl/plperl/plperl.c#L444C12-L444C12,
https://github.com/postgres/postgres/blob/4210b55f598534db9d52c4535b7dcc777dda75a6/src/pl/plperl/plperl.c#L452C6-L452C6)
(Can we treat plperl as an extension?), pltcl (
https://github.com/postgres/postgres/blob/4210b55f598534db9d52c4535b7dcc777dda75a6/src/pl/tcl/pltcl.c#L465C14-L465C14,
https://github.com/postgres/postgres/blob/4210b55f598534db9d52c4535b7dcc777dda75a6/src/pl/tcl/pltcl.c#L472C12-L472C12
).
TBH, I don't know if NULL is a valid boot_val for string variables, I just
came across some extensions that use NULL as their boot_val. If the
boot_val can't be NULL in extensions, we should probably add some
assertions or comments about it?
Best Regards,
Xing
On Wed, Nov 1, 2023 at 11:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
Xing Guo <higuoxing@gmail.com> writes:
Thanks for your comments. I have updated the patch accordingly.
I'm leery of accepting this patch, as I see no reason that we
should consider it valid for an extension to have a string GUC
with a boot_val of NULL.I realize that we have a few core GUCs that are like that, but
I'm pretty sure that every one of them has special-case code
that initializes the GUC to something non-null a bit later on
in startup. I don't think there are any cases where a string
GUC's persistent value will be null, and I don't like the
idea of considering that to be an allowed case. It would
open the door to more crash situations, and it brings up the
old question of how could a user tell NULL from empty string
(via SHOW or current_setting() or whatever). Besides, what's
the benefit really?regards, tom lane
Xing Guo <higuoxing@gmail.com> writes:
There're extensions that set their boot_val to NULL. E.g., postgres_fdw
Hmm ... if we're doing it ourselves, I suppose we've got to consider
it supported :-(. But I'm still wondering how many seldom-used
code paths didn't get the message. An example here is that this
could lead to GetConfigOptionResetString returning NULL, which
I think is outside its admittedly-vague API spec.
regards, tom lane
I wrote:
Hmm ... if we're doing it ourselves, I suppose we've got to consider
it supported :-(. But I'm still wondering how many seldom-used
code paths didn't get the message. An example here is that this
could lead to GetConfigOptionResetString returning NULL, which
I think is outside its admittedly-vague API spec.
After digging around for a bit, I think part of the problem is a lack
of a clearly defined spec for what should happen with NULL string GUCs.
In the attached v3, I attempted to remedy that by adding a comment in
guc_tables.h (which is maybe not the best place but I didn't see a
better one). That led me to a couple more changes beyond what you had.
It's possible that some of these are unreachable --- for example,
given that a NULL could only be the default value, I'm not sure that
the fix in write_one_nondefault_variable is a live bug. But we ought
to code all this stuff defensively, and most of it already was
NULL-safe.
regards, tom lane
Attachments:
v3-0001-Don-t-use-strcmp-with-nullable-pointers.patchtext/x-diff; charset=us-ascii; name=v3-0001-Don-t-use-strcmp-with-nullable-pointers.patchDownload+30-8
Thank you Tom!
Your comment
"NULL doesn't have semantics that are visibly different from an empty
string" is exactly what I want to confirm :-)
On 11/2/23, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
Hmm ... if we're doing it ourselves, I suppose we've got to consider
it supported :-(. But I'm still wondering how many seldom-used
code paths didn't get the message. An example here is that this
could lead to GetConfigOptionResetString returning NULL, which
I think is outside its admittedly-vague API spec.After digging around for a bit, I think part of the problem is a lack
of a clearly defined spec for what should happen with NULL string GUCs.
In the attached v3, I attempted to remedy that by adding a comment in
guc_tables.h (which is maybe not the best place but I didn't see a
better one). That led me to a couple more changes beyond what you had.It's possible that some of these are unreachable --- for example,
given that a NULL could only be the default value, I'm not sure that
the fix in write_one_nondefault_variable is a live bug. But we ought
to code all this stuff defensively, and most of it already was
NULL-safe.regards, tom lane
--
Best Regards,
Xing
On Wed, Nov 01, 2023 at 09:57:18PM -0400, Tom Lane wrote:
I wrote:
Hmm ... if we're doing it ourselves, I suppose we've got to consider
it supported :-(. But I'm still wondering how many seldom-used
code paths didn't get the message. An example here is that this
could lead to GetConfigOptionResetString returning NULL, which
I think is outside its admittedly-vague API spec.After digging around for a bit, I think part of the problem is a lack
of a clearly defined spec for what should happen with NULL string GUCs.
In the attached v3, I attempted to remedy that by adding a comment in
guc_tables.h (which is maybe not the best place but I didn't see a
better one). That led me to a couple more changes beyond what you had.
What if we disallowed NULL string GUCs in v17? That'd simplify the spec
and future-proof against similar bugs, but it might also break a fair
number of extensions. If there aren't any other reasons to continue
supporting it, maybe it's the right long-term approach, though.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Nathan Bossart <nathandbossart@gmail.com> writes:
On Wed, Nov 01, 2023 at 09:57:18PM -0400, Tom Lane wrote:
After digging around for a bit, I think part of the problem is a lack
of a clearly defined spec for what should happen with NULL string GUCs.
What if we disallowed NULL string GUCs in v17?
Well, we'd need to devise some other solution for hacks like the
one used by timezone_abbreviations (see comment in
check_timezone_abbreviations). I think it's not worth the trouble,
especially seeing that 95% of guc.c is already set up for this.
The bugs are mostly in newer code like get_explain_guc_options,
and I think that's directly traceable to the lack of any comments
or docs about this behavior.
regards, tom lane
On Wed, Nov 01, 2023 at 10:39:04PM -0400, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
What if we disallowed NULL string GUCs in v17?
Well, we'd need to devise some other solution for hacks like the
one used by timezone_abbreviations (see comment in
check_timezone_abbreviations). I think it's not worth the trouble,
especially seeing that 95% of guc.c is already set up for this.
The bugs are mostly in newer code like get_explain_guc_options,
and I think that's directly traceable to the lack of any comments
or docs about this behavior.
Eh, yeah, it's probably not worth it if we find ourselves trading one set
of hacks for another.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
Seems that Tom's patch cannot be applied to the current master branch.
I just re-generate the patch for others to play with.
On 11/2/23, Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Nov 01, 2023 at 10:39:04PM -0400, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
What if we disallowed NULL string GUCs in v17?
Well, we'd need to devise some other solution for hacks like the
one used by timezone_abbreviations (see comment in
check_timezone_abbreviations). I think it's not worth the trouble,
especially seeing that 95% of guc.c is already set up for this.
The bugs are mostly in newer code like get_explain_guc_options,
and I think that's directly traceable to the lack of any comments
or docs about this behavior.Eh, yeah, it's probably not worth it if we find ourselves trading one set
of hacks for another.--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
--
Best Regards,
Xing
Attachments:
v3-0001-Consider-treating-NULL-is-a-valid-boot_val-for-st.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Consider-treating-NULL-is-a-valid-boot_val-for-st.patchDownload+30-9
Looking closer, I realized that my proposed change in RestoreGUCState
is unnecessary, because guc_free() is already permissive about being
passed a NULL. That leaves us with one live bug in
get_explain_guc_options, two probably-unreachable hazards in
check_GUC_init and write_one_nondefault_variable, and two API changes
in GetConfigOption and GetConfigOptionResetString. I'm dubious that
back-patching the API changes would be a good idea, so I applied
that to HEAD only. The rest I backpatched as far as relevant.
Thanks for the report!
regards, tom lane