Dereferenced pointers checked as NULL in btree_utils_var.c
Hi all,
Coverity is pointing out $subject, with the following stuff in gbt_var_same():
GBT_VARKEY *t1 = (GBT_VARKEY *) DatumGetPointer(d1);
GBT_VARKEY *t2 = (GBT_VARKEY *) DatumGetPointer(d2);
GBT_VARKEY_R r1,
r2;
r1 = gbt_var_key_readable(t1); <= t1 dereferenced
r2 = gbt_var_key_readable(t2); <= t2 dereferenced
if (t1 && t2)
result = ((*tinfo->f_cmp) (r1.lower, r2.lower,
collation) == 0 &&
(*tinfo->f_cmp) (r1.upper, r2.upper,
collation) == 0);
else
result = (t1 == NULL && t2 == NULL); <= Coverity complains here
return result;
As Heikki pointed me out on IM, the lack of crash report in this area,
as well as similar coding style in cube/ seem to be sufficient
arguments to simply remove those NULL checks instead of doing more
solid checks on them. Patch is attached.
Regards,
--
Michael
Attachments:
20150120_btree_utils_fix.patchtext/x-patch; charset=US-ASCII; name=20150120_btree_utils_fix.patchDownload
diff --git a/contrib/btree_gist/btree_utils_var.c b/contrib/btree_gist/btree_utils_var.c
index b7dd060..6ad3347 100644
--- a/contrib/btree_gist/btree_utils_var.c
+++ b/contrib/btree_gist/btree_utils_var.c
@@ -337,7 +337,6 @@ bool
gbt_var_same(Datum d1, Datum d2, Oid collation,
const gbtree_vinfo *tinfo)
{
- bool result;
GBT_VARKEY *t1 = (GBT_VARKEY *) DatumGetPointer(d1);
GBT_VARKEY *t2 = (GBT_VARKEY *) DatumGetPointer(d2);
GBT_VARKEY_R r1,
@@ -346,13 +345,8 @@ gbt_var_same(Datum d1, Datum d2, Oid collation,
r1 = gbt_var_key_readable(t1);
r2 = gbt_var_key_readable(t2);
- if (t1 && t2)
- result = ((*tinfo->f_cmp) (r1.lower, r2.lower, collation) == 0 &&
- (*tinfo->f_cmp) (r1.upper, r2.upper, collation) == 0);
- else
- result = (t1 == NULL && t2 == NULL);
-
- return result;
+ return ((*tinfo->f_cmp) (r1.lower, r2.lower, collation) == 0 &&
+ (*tinfo->f_cmp) (r1.upper, r2.upper, collation) == 0);
}
Michael Paquier <michael.paquier@gmail.com> writes:
Coverity is pointing out $subject, with the following stuff in gbt_var_same():
...
As Heikki pointed me out on IM, the lack of crash report in this area,
as well as similar coding style in cube/ seem to be sufficient
arguments to simply remove those NULL checks instead of doing more
solid checks on them. Patch is attached.
The way to form a convincing argument that these checks are unnecessary
would be to verify that (1) the SQL-accessible functions directly calling
gbt_var_same() are all marked STRICT, and (2) the core GIST code never
passes a NULL to these support functions. I'm prepared to believe that
(1) and (2) are both true, but it merits checking.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jan 20, 2015 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
Coverity is pointing out $subject, with the following stuff in gbt_var_same():
...
As Heikki pointed me out on IM, the lack of crash report in this area,
as well as similar coding style in cube/ seem to be sufficient
arguments to simply remove those NULL checks instead of doing more
solid checks on them. Patch is attached.The way to form a convincing argument that these checks are unnecessary
would be to verify that (1) the SQL-accessible functions directly calling
gbt_var_same() are all marked STRICT, and (2) the core GIST code never
passes a NULL to these support functions. I'm prepared to believe that
(1) and (2) are both true, but it merits checking.
Sure. gbt_var_same is called by those functions in btree_gist/:
- gbt_bit_same
- gbt_bytea_same
- gbt_numeric_same
- gbt_text_same
=# select proname, proisstrict from pg_proc
where proname in ('gbt_bit_same', 'gbt_bytea_same',
'gbt_numeric_same', 'gbt_text_same');
proname | proisstrict
------------------+-------------
gbt_text_same | t
gbt_bytea_same | t
gbt_numeric_same | t
gbt_bit_same | t
(4 rows)
For the second point, I have run regression tests with an assertion in
gbt_var_same checking if t1 or t2 are NULL and tests worked. Also,
looking at the code of gist, gbt_var_same is called through
gistKeyIsEQ, which is used for an insertion or for a split. The point
is that when doing an insertion, a call to gistgetadjusted is done and
we look if one of the keys is NULL. If one of them is, code does not
go through gistKeyIsEQ, so the NULL checks do not seem necessary in
gbt_var_same.
Btw, looking at the code of gist, I think that it would be interesting
to add an assertion in gistKeyIsEQ like that:
Assert(DatumGetPointer(a) != NULL && DatumGetPointer(b) != NULL);
Thoughts?
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 01/21/2015 07:14 AM, Michael Paquier wrote:
Also,
looking at the code of gist, gbt_var_same is called through
gistKeyIsEQ, which is used for an insertion or for a split. The point
is that when doing an insertion, a call to gistgetadjusted is done and
we look if one of the keys is NULL. If one of them is, code does not
go through gistKeyIsEQ, so the NULL checks do not seem necessary in
gbt_var_same.
I think you are confusing NULL pointers with an SQL NULL.
gistgetadjusted checks that it's not an SQL NULL (!oldisnull[i]), but it
does not check if it's a NULL pointer
(DatumGetPointer(oldentries[i].key) != NULL). The case we're worried
about is that the value is not an SQL NULL, i.e. isnull flag is false,
but the Datum is a NULL pointer.
Nevertheless, looking at the code, I don't that a NULL pointer can ever
be passed to the same-method, for any of the built-in or contrib
opclasses, but it's very confusing because some functions check for
that. My proof goes like this:
1. The key value passed as argument must've been originally formed by
the compress, union, or picksplit methods.
1.1. Some compress methods do nothing, and just return the passed-in
key, which comes from the table and cannot be a NULL pointer (the
compress method is never called for SQL NULLs). Other compress methods
don't check for a NULL pointer, and would crash if there was one.
gist_poly_compress() and gist_circle_compress() do check for a NULL, but
they only return a NULL key if the input key is NULL, which cannot
happen because the input comes from a table. So I believe the
NULL-checks in those functions are dead code, and none of the compress
methods ever return a NULL key.
1.2. None of the union methods return a NULL pointer (nor expect one as
input).
1.3. None of the picksplit methods return a NULL pointer. They all
return one of the original values, or one formed with the union method.
The picksplit method can return a NULL pointer in the spl_ldatum or
spl_rdatum field, in the degenerate case that it puts all keys on the
same halve. However, the caller, gistUserPickSplit checks for that and
immediately overwrites spl_ldatum and spl_rdatum with sane values in
that case.
I don't understand what this sentence in the documentation on the
compress method means:
Depending on your needs, you could also need to care about
compressing NULL values in there, storing for example (Datum) 0 like
gist_circle_compress does.
The compress method is never called for NULLs, so the above is nonsense.
I think it should be removed, as well as the checks in
gist_circle_compress and gist_poly_compress. According to git history,
the check in gist_circle_compress been there ever since the module was
imported into contrib/rtree_gist, in 2001. The documentation was added
later:
commit a0a3883dd977d6618899ccd14258a0696912a9d2
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri Jun 12 19:48:53 2009 +0000
Improve documentation about GiST opclass support functions.
Dimitri Fontaine
I'm guessing that Tom added that sentence (it was not in the patch that
Dimitri submitted originally) just because there was that check in the
existing function, without realizing that the check was in fact dead code.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
I think you are confusing NULL pointers with an SQL NULL.
gistgetadjusted checks that it's not an SQL NULL (!oldisnull[i]), but it
does not check if it's a NULL pointer
(DatumGetPointer(oldentries[i].key) != NULL). The case we're worried
about is that the value is not an SQL NULL, i.e. isnull flag is false,
but the Datum is a NULL pointer.
Actually both of these deserve to be worried about; though it's fairly
clear from looking at the core GIST code that it avoids calling
gistKeyIsEQ on SQL NULLs.
Nevertheless, looking at the code, I don't that a NULL pointer can ever
be passed to the same-method, for any of the built-in or contrib
opclasses, but it's very confusing because some functions check for
that. My proof goes like this:
1. The key value passed as argument must've been originally formed by
the compress, union, or picksplit methods.
1.1. Some compress methods do nothing, and just return the passed-in
key, which comes from the table and cannot be a NULL pointer (the
compress method is never called for SQL NULLs). Other compress methods
don't check for a NULL pointer, and would crash if there was one.
gist_poly_compress() and gist_circle_compress() do check for a NULL, but
they only return a NULL key if the input key is NULL, which cannot
happen because the input comes from a table. So I believe the
NULL-checks in those functions are dead code, and none of the compress
methods ever return a NULL key.
1.2. None of the union methods return a NULL pointer (nor expect one as
input).
1.3. None of the picksplit methods return a NULL pointer. They all
return one of the original values, or one formed with the union method.
The picksplit method can return a NULL pointer in the spl_ldatum or
spl_rdatum field, in the degenerate case that it puts all keys on the
same halve. However, the caller, gistUserPickSplit checks for that and
immediately overwrites spl_ldatum and spl_rdatum with sane values in
that case.
Sounds good to me.
I don't understand what this sentence in the documentation on the
compress method means:
Depending on your needs, you could also need to care about
compressing NULL values in there, storing for example (Datum) 0 like
gist_circle_compress does.
I believe you're right that I added this because there were checks for
null pointers in some but not all of the opclass support functions.
It looked to me like some opclasses might be intending to pass around null
pointers as valid (not-SQL-NULL) values. I think your analysis above
eliminates that idea though. It's a sufficiently weird concept that
I don't feel a need to document or support it.
So I'm fine with taking out both this documentation text and the dead
null-pointer checks; but let's do that all in one patch not piecemeal.
In any case, this is just cosmetic cleanup; there's no actual hazard
here.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jan 28, 2015 at 3:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
So I'm fine with taking out both this documentation text and the dead
null-pointer checks; but let's do that all in one patch not piecemeal.
In any case, this is just cosmetic cleanup; there's no actual hazard
here.
Attached is a patch with all those things done. I added as well an
assertion in gistKeyIsEQ checking if the input datums are NULL. I
believe that this is still useful for developers, feel free to rip it
out from the patch if you think otherwise.
Regards,
--
Michael
Attachments:
20150128_gist_nullchecks.patchtext/x-patch; charset=US-ASCII; name=20150128_gist_nullchecks.patchDownload
From 95b051aac56de68412a7b0635484e46eb4e24ad0 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Wed, 28 Jan 2015 09:36:11 +0900
Subject: [PATCH] Remove dead NULL-pointer checks in GiST code
The key value passed to the same method is generated by either the
compress, union or picksplit methods, but it happens that none of them
actually pass a NULL pointer. Some compress methods do not check for a
NULL pointer, what should have resulted in a crash. Note that
gist_poly_compress() and gist_circle_compress() do check for a NULL,
but they only return a NULL key if the input key is NULL, which cannot
happen because the input comes from a table.
This commit also removes a documentation note added in commit a0a3883
that has been introduced based on the fact that some routines of the
same method were doing NULL-pointer checks, and adds an assertion
in gistKeyIsEQ to ensure that no NULL keys are passed when calling the
same method.
---
contrib/btree_gist/btree_utils_num.c | 8 ++----
contrib/btree_gist/btree_utils_var.c | 10 ++-----
doc/src/sgml/gist.sgml | 5 ----
src/backend/access/gist/gistproc.c | 56 ++++++++++++------------------------
src/backend/access/gist/gistutil.c | 3 ++
5 files changed, 25 insertions(+), 57 deletions(-)
diff --git a/contrib/btree_gist/btree_utils_num.c b/contrib/btree_gist/btree_utils_num.c
index 505633c..7c5b911 100644
--- a/contrib/btree_gist/btree_utils_num.c
+++ b/contrib/btree_gist/btree_utils_num.c
@@ -147,13 +147,9 @@ gbt_num_same(const GBT_NUMKEY *a, const GBT_NUMKEY *b, const gbtree_ninfo *tinfo
b2.lower = &(((GBT_NUMKEY *) b)[0]);
b2.upper = &(((GBT_NUMKEY *) b)[tinfo->size]);
- if (
- (*tinfo->f_eq) (b1.lower, b2.lower) &&
- (*tinfo->f_eq) (b1.upper, b2.upper)
- )
- return TRUE;
- return FALSE;
+ return ((*tinfo->f_eq) (b1.lower, b2.lower) &&
+ (*tinfo->f_eq) (b1.upper, b2.upper));
}
diff --git a/contrib/btree_gist/btree_utils_var.c b/contrib/btree_gist/btree_utils_var.c
index b7dd060..6ad3347 100644
--- a/contrib/btree_gist/btree_utils_var.c
+++ b/contrib/btree_gist/btree_utils_var.c
@@ -337,7 +337,6 @@ bool
gbt_var_same(Datum d1, Datum d2, Oid collation,
const gbtree_vinfo *tinfo)
{
- bool result;
GBT_VARKEY *t1 = (GBT_VARKEY *) DatumGetPointer(d1);
GBT_VARKEY *t2 = (GBT_VARKEY *) DatumGetPointer(d2);
GBT_VARKEY_R r1,
@@ -346,13 +345,8 @@ gbt_var_same(Datum d1, Datum d2, Oid collation,
r1 = gbt_var_key_readable(t1);
r2 = gbt_var_key_readable(t2);
- if (t1 && t2)
- result = ((*tinfo->f_cmp) (r1.lower, r2.lower, collation) == 0 &&
- (*tinfo->f_cmp) (r1.upper, r2.upper, collation) == 0);
- else
- result = (t1 == NULL && t2 == NULL);
-
- return result;
+ return ((*tinfo->f_cmp) (r1.lower, r2.lower, collation) == 0 &&
+ (*tinfo->f_cmp) (r1.upper, r2.upper, collation) == 0);
}
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
index 5de282b..f9644b0 100644
--- a/doc/src/sgml/gist.sgml
+++ b/doc/src/sgml/gist.sgml
@@ -498,11 +498,6 @@ my_compress(PG_FUNCTION_ARGS)
course.
</para>
- <para>
- Depending on your needs, you could also need to care about
- compressing <literal>NULL</> values in there, storing for example
- <literal>(Datum) 0</> like <literal>gist_circle_compress</> does.
- </para>
</listitem>
</varlistentry>
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
index 4decaa6..1df6357 100644
--- a/src/backend/access/gist/gistproc.c
+++ b/src/backend/access/gist/gistproc.c
@@ -1039,25 +1039,15 @@ gist_poly_compress(PG_FUNCTION_ARGS)
if (entry->leafkey)
{
- retval = palloc(sizeof(GISTENTRY));
- if (DatumGetPointer(entry->key) != NULL)
- {
- POLYGON *in = DatumGetPolygonP(entry->key);
- BOX *r;
+ POLYGON *in = DatumGetPolygonP(entry->key);
+ BOX *r;
- r = (BOX *) palloc(sizeof(BOX));
- memcpy((void *) r, (void *) &(in->boundbox), sizeof(BOX));
- gistentryinit(*retval, PointerGetDatum(r),
- entry->rel, entry->page,
- entry->offset, FALSE);
-
- }
- else
- {
- gistentryinit(*retval, (Datum) 0,
- entry->rel, entry->page,
- entry->offset, FALSE);
- }
+ retval = palloc(sizeof(GISTENTRY));
+ r = (BOX *) palloc(sizeof(BOX));
+ memcpy((void *) r, (void *) &(in->boundbox), sizeof(BOX));
+ gistentryinit(*retval, PointerGetDatum(r),
+ entry->rel, entry->page,
+ entry->offset, FALSE);
}
else
retval = entry;
@@ -1113,28 +1103,18 @@ gist_circle_compress(PG_FUNCTION_ARGS)
if (entry->leafkey)
{
+ CIRCLE *in = DatumGetCircleP(entry->key);
+ BOX *r;
retval = palloc(sizeof(GISTENTRY));
- if (DatumGetCircleP(entry->key) != NULL)
- {
- CIRCLE *in = DatumGetCircleP(entry->key);
- BOX *r;
-
- r = (BOX *) palloc(sizeof(BOX));
- r->high.x = in->center.x + in->radius;
- r->low.x = in->center.x - in->radius;
- r->high.y = in->center.y + in->radius;
- r->low.y = in->center.y - in->radius;
- gistentryinit(*retval, PointerGetDatum(r),
- entry->rel, entry->page,
- entry->offset, FALSE);
- }
- else
- {
- gistentryinit(*retval, (Datum) 0,
- entry->rel, entry->page,
- entry->offset, FALSE);
- }
+ r = (BOX *) palloc(sizeof(BOX));
+ r->high.x = in->center.x + in->radius;
+ r->low.x = in->center.x - in->radius;
+ r->high.y = in->center.y + in->radius;
+ r->low.y = in->center.y - in->radius;
+ gistentryinit(*retval, PointerGetDatum(r),
+ entry->rel, entry->page,
+ entry->offset, FALSE);
}
else
retval = entry;
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 38af0e0..a44cce8 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -276,6 +276,9 @@ gistKeyIsEQ(GISTSTATE *giststate, int attno, Datum a, Datum b)
{
bool result;
+ /* both keys cannot be NULL */
+ Assert(DatumGetPointer(a) != NULL && DatumGetPointer(b) != NULL);
+
FunctionCall3Coll(&giststate->equalFn[attno],
giststate->supportCollation[attno],
a, b,
--
2.2.2
On 01/28/2015 02:47 AM, Michael Paquier wrote:
On Wed, Jan 28, 2015 at 3:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
So I'm fine with taking out both this documentation text and the dead
null-pointer checks; but let's do that all in one patch not piecemeal.
In any case, this is just cosmetic cleanup; there's no actual hazard
here.Attached is a patch with all those things done.
Thanks, applied.
I added as well an assertion in gistKeyIsEQ checking if the input
datums are NULL. I believe that this is still useful for developers,
feel free to rip it out from the patch if you think otherwise.
I ripped it out because I think was wrong. It assumed that the input
Datums are pass-by-reference, which is not a given. It looks that's true
for all the current opclasses, so I wouldn't be surprised if there are
hidden assumptions on that elsewhere in the code, but it was wrong
nevertheless.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers