sparse (static analyzer) report

Started by Mark Wongalmost 21 years ago20 messages
#1Mark Wong
markw@osdl.org

Hi,

Just wondering if anyone finds spare's analysis useful. I ran it
against 8.0-rc5:
http://developer.osdl.org/markw/pgsql/sparse/pg-8.0rc5.txt

Sparse can be downloaded
http://www.codemonkey.org.uk/projects/bitkeeper/sparse/
or
bk://sparse.bkbits.net/sparse

#2Mark Wong
markw@osdl.org
In reply to: Mark Wong (#1)
Re: sparse (static analyzer) report

We've also started automating sparse analyses in our PLM tool, which
will show an error and warning count. Here's an example:
http://www.osdl.org/plm-cgi/plm?module=patch_info&patch_id=4065

Show quoted text

On Wed, Jan 12, 2005 at 02:18:36PM -0800, Mark Wong wrote:

Hi,

Just wondering if anyone finds spare's analysis useful. I ran it
against 8.0-rc5:
http://developer.osdl.org/markw/pgsql/sparse/pg-8.0rc5.txt

Sparse can be downloaded
http://www.codemonkey.org.uk/projects/bitkeeper/sparse/
or
bk://sparse.bkbits.net/sparse

#3Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Mark Wong (#2)
Re: sparse (static analyzer) report

On Thu, Jan 13, 2005 at 01:31:36PM -0800, Mark Wong wrote:

We've also started automating sparse analyses in our PLM tool, which
will show an error and warning count. Here's an example:
http://www.osdl.org/plm-cgi/plm?module=patch_info&patch_id=4065

I took a peek at the first sparse report you posted, and it's too noisy
to be useful. The parser seems confused in several ways in thousands of
places.

Maybe there's something useful to be extracted, but we'd need to adapt
sparse.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
www.google.com: interfaz de l�nea de comando para la web.

#4Mark Wong
markw@osdl.org
In reply to: Alvaro Herrera (#3)
Re: sparse (static analyzer) report

On Thu, Jan 13, 2005 at 08:06:23PM -0300, Alvaro Herrera wrote:

On Thu, Jan 13, 2005 at 01:31:36PM -0800, Mark Wong wrote:

We've also started automating sparse analyses in our PLM tool, which
will show an error and warning count. Here's an example:
http://www.osdl.org/plm-cgi/plm?module=patch_info&amp;patch_id=4065

I took a peek at the first sparse report you posted, and it's too noisy
to be useful. The parser seems confused in several ways in thousands of
places.

Maybe there's something useful to be extracted, but we'd need to adapt
sparse.

It might not have helped to dump every source file's report into a
single file. Would it have helped to split out the results per file?

Mark

#5Mark Wong
markw@osdl.org
In reply to: Mark Wong (#4)
Re: sparse (static analyzer) report

On Thu, Jan 13, 2005 at 03:09:19PM -0800, Mark Wong wrote:

On Thu, Jan 13, 2005 at 08:06:23PM -0300, Alvaro Herrera wrote:

On Thu, Jan 13, 2005 at 01:31:36PM -0800, Mark Wong wrote:

We've also started automating sparse analyses in our PLM tool, which
will show an error and warning count. Here's an example:
http://www.osdl.org/plm-cgi/plm?module=patch_info&amp;patch_id=4065

I took a peek at the first sparse report you posted, and it's too noisy
to be useful. The parser seems confused in several ways in thousands of
places.

Maybe there's something useful to be extracted, but we'd need to adapt
sparse.

It might not have helped to dump every source file's report into a
single file. Would it have helped to split out the results per file?

Something like this:
http://developer.osdl.org/markw/pgsql/sparse/pg-8.0rc5.html

Mark

#6Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Mark Wong (#5)
Re: sparse (static analyzer) report

On Fri, Jan 14, 2005 at 03:53:09PM -0800, Mark Wong wrote:

On Thu, Jan 13, 2005 at 03:09:19PM -0800, Mark Wong wrote:

On Thu, Jan 13, 2005 at 08:06:23PM -0300, Alvaro Herrera wrote:

On Thu, Jan 13, 2005 at 01:31:36PM -0800, Mark Wong wrote:

We've also started automating sparse analyses in our PLM tool, which
will show an error and warning count. Here's an example:

http://developer.osdl.org/markw/pgsql/sparse/pg-8.0rc5.html

Hmm. Well, it showed the multiple incorrect uses of 0 as NULL in
dllist.c and other places, but there's still lots of spurious entries.
See backend/transam/xlog.c: it's strange that the parser is so confused
about XLogCtrl, for example.

It's complaining in several places about function as variables in
function declarations (the multiple walkers and mutators for example);
not sure how correct that is.

It is also analyzing flex and bison output files ... is it capable of
analyzing .y and .l files instead?

It's strange that DatumGetInt32 shows as a undefined identifier ...
there's some problem with postgres.h apparently. And fmgroids.h is
missing; not sure what's the minimal make target to install it, because
make -C src/backend/utils fmgroids.h
generates it, but the symbolic link to src/include/utils is still
needed.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"La gente vulgar solo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#6)
Re: sparse (static analyzer) report

Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

... And fmgroids.h is
missing; not sure what's the minimal make target to install it, because
make -C src/backend/utils fmgroids.h
generates it, but the symbolic link to src/include/utils is still
needed.

Looks like the symlinks are made in src/backend/Makefile.

regards, tom lane

#8Mark Wong
markw@osdl.org
In reply to: Alvaro Herrera (#6)
Re: sparse (static analyzer) report

On Fri, Jan 14, 2005 at 09:54:24PM -0300, Alvaro Herrera wrote:

On Fri, Jan 14, 2005 at 03:53:09PM -0800, Mark Wong wrote:

On Thu, Jan 13, 2005 at 03:09:19PM -0800, Mark Wong wrote:

On Thu, Jan 13, 2005 at 08:06:23PM -0300, Alvaro Herrera wrote:

On Thu, Jan 13, 2005 at 01:31:36PM -0800, Mark Wong wrote:

We've also started automating sparse analyses in our PLM tool, which
will show an error and warning count. Here's an example:

http://developer.osdl.org/markw/pgsql/sparse/pg-8.0rc5.html

Hmm. Well, it showed the multiple incorrect uses of 0 as NULL in
dllist.c and other places, but there's still lots of spurious entries.
See backend/transam/xlog.c: it's strange that the parser is so confused
about XLogCtrl, for example.

I'm sure there are a number of false positives, if that's what you're
getting at.

It's complaining in several places about function as variables in
function declarations (the multiple walkers and mutators for example);
not sure how correct that is.

It is also analyzing flex and bison output files ... is it capable of
analyzing .y and .l files instead?

Yeah, I generate the file list to run sparse against with a
"find . -name '*.c'". So that's simple enough. But flex and bison
files don't end in .c, do they?

It's strange that DatumGetInt32 shows as a undefined identifier ...
there's some problem with postgres.h apparently. And fmgroids.h is
missing; not sure what's the minimal make target to install it, because
make -C src/backend/utils fmgroids.h
generates it, but the symbolic link to src/include/utils is still
needed.

Perhaps part of this has to do with my rather blind selection of files
to run sparse against. For example, you still have to run configure
first and it probably doesn't make sense to run on some of the files
in ports. I did notice one of the files complained about missing
Windows.h. :)

Mark

#9Neil Conway
neilc@samurai.com
In reply to: Alvaro Herrera (#6)
Re: sparse (static analyzer) report

BTW, perhaps one reason for the relatively small number of legitimate
issues picked up by sparse is that I ran sparse on the tree a month or
two ago and fixed some of the stylistic issues it reported. Most of the
stuff I didn't bother to fix looked like either a sparse bug, or a
marginal style improvement I didn't bother applying (like fixing 0 =>
NULL in dllist.c).

I've been meaning to investigate whether sparse can be used as something
more than just a fussy syntax checker (i.e. whether it can do any
meaningful static analysis for interesting properties), but I haven't
had a chance yet.

Alvaro Herrera wrote:

It's complaining in several places about function as variables in
function declarations (the multiple walkers and mutators for example);
not sure how correct that is.

I believe the conclusion of prior discussions about making the
walker/mutator prototypes more precise is that it's not worth the cost.

-Neil

P.S. Hope everyone had a good holiday. I'm back at work on Monday.

#10Mark Wong
markw@osdl.org
In reply to: Neil Conway (#9)
Re: sparse (static analyzer) report

Ah, so you beat me to it Neil. ;) Out of curiosity, how much worse
was it before you started fixing things?

Mark

Show quoted text

On Sat, Jan 15, 2005 at 01:30:37PM +1100, Neil Conway wrote:

BTW, perhaps one reason for the relatively small number of legitimate
issues picked up by sparse is that I ran sparse on the tree a month or
two ago and fixed some of the stylistic issues it reported. Most of the
stuff I didn't bother to fix looked like either a sparse bug, or a
marginal style improvement I didn't bother applying (like fixing 0 =>
NULL in dllist.c).

I've been meaning to investigate whether sparse can be used as something
more than just a fussy syntax checker (i.e. whether it can do any
meaningful static analysis for interesting properties), but I haven't
had a chance yet.

Alvaro Herrera wrote:

It's complaining in several places about function as variables in
function declarations (the multiple walkers and mutators for example);
not sure how correct that is.

I believe the conclusion of prior discussions about making the
walker/mutator prototypes more precise is that it's not worth the cost.

-Neil

P.S. Hope everyone had a good holiday. I'm back at work on Monday.

#11Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Mark Wong (#8)
Re: sparse (static analyzer) report

On Fri, Jan 14, 2005 at 05:50:30PM -0800, Mark Wong wrote:

Yeah, I generate the file list to run sparse against with a
"find . -name '*.c'". So that's simple enough. But flex and bison
files don't end in .c, do they?

Generated files do. I have a list of generated files here:

pl_gram.c
pl_scan.c
psqlscan.c
preproc.c
pgc.c
guc-file.c
fmgrtab.c
gram.c
scan.c
bootparse.c
bootscanner.c

Not sure how to skip them with find ... I think you can do that with
-regex but it's cumbersome. Re: port, I think you can -prune it.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
Jude: I wish humans laid eggs
Ringlord: Why would you want humans to lay eggs?
Jude: So I can eat them

#12Greg Stark
gsstark@mit.edu
In reply to: Alvaro Herrera (#6)
Re: sparse (static analyzer) report

Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

Hmm. Well, it showed the multiple incorrect uses of 0 as NULL in
dllist.c and other places,

Incidentally, while it may not be conformant to your style guidelines, use of
the constant 0 compared to or assigned to a pointer is a perfectly valid ANSI
spelling for NULL. (The same is not true for an expression that happens to
evaluate to 0.)

--
greg

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#12)
Re: sparse (static analyzer) report

Greg Stark <gsstark@mit.edu> writes:

Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

Hmm. Well, it showed the multiple incorrect uses of 0 as NULL in
dllist.c and other places,

Incidentally, while it may not be conformant to your style guidelines, use of
the constant 0 compared to or assigned to a pointer is a perfectly valid ANSI
spelling for NULL.

Absolutely. But I agree that it is more readable to use NULL when you
mean a null pointer, and 0 when you mean an integer zero. The C
standard may not distinguish these concepts, but I do ;-)

Something that I don't have a real strong feeling about is
if (ptr != NULL)
versus
if (ptr)
I've been known to write both. Can anyone mount a good readability
argument for one over the other?

How about the inverse case,
if (ptr == NULL)
versus
if (!ptr)
Applying a boolean ! to a pointer seems a bit shaky to me, though
it's certainly a common locution.

regards, tom lane

#14Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#13)
Re: sparse (static analyzer) report

Tom Lane wrote:

Greg Stark <gsstark@mit.edu> writes:

Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

Hmm. Well, it showed the multiple incorrect uses of 0 as NULL in
dllist.c and other places,

Incidentally, while it may not be conformant to your style guidelines, use of
the constant 0 compared to or assigned to a pointer is a perfectly valid ANSI
spelling for NULL.

Absolutely. But I agree that it is more readable to use NULL when you
mean a null pointer, and 0 when you mean an integer zero. The C
standard may not distinguish these concepts, but I do ;-)

Something that I don't have a real strong feeling about is
if (ptr != NULL)
versus
if (ptr)
I've been known to write both. Can anyone mount a good readability
argument for one over the other?

How about the inverse case,
if (ptr == NULL)
versus
if (!ptr)
Applying a boolean ! to a pointer seems a bit shaky to me, though
it's certainly a common locution.

If we allow "if (ptr)" then allowing the inverse to be "if (! ptr)"
seems logical enough. As you say, it's a very common idiom, and allowing
one without the other would be rather non-orthogonal.

cheers

andrew

#15Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Andrew Dunstan (#14)
Re: sparse (static analyzer) report

On Sat, Jan 15, 2005 at 08:31:42AM -0500, Andrew Dunstan wrote:

Tom Lane wrote:

Something that I don't have a real strong feeling about is
if (ptr != NULL)
versus
if (ptr)
I've been known to write both. Can anyone mount a good readability
argument for one over the other?

I assume people don't like the PointerIsValid() macro defined in c.h?

Just for consistency we could use that everywhere or get rid of it ...
but then, maybe it is used by external code so we shouldn't do the
latter.

How about the inverse case,
if (ptr == NULL)
versus
if (!ptr)
Applying a boolean ! to a pointer seems a bit shaky to me, though
it's certainly a common locution.

If we allow "if (ptr)" then allowing the inverse to be "if (! ptr)"
seems logical enough. As you say, it's a very common idiom, and allowing
one without the other would be rather non-orthogonal.

I'd rather have legibility over orthogonality on this issue. I prefer
ptr == NULL myself, though not too strongly.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"No necesitamos banderas
No reconocemos fronteras" (Jorge Gonz���lez)

#16Greg Stark
gsstark@mit.edu
In reply to: Tom Lane (#13)
Re: sparse (static analyzer) report

Tom Lane <tgl@sss.pgh.pa.us> writes:

if (ptr)

Can anyone mount a good readability argument for one over the other?

My argument in favour of "if (ptr)" as well as "if (integer)" and "if (!ptr)"
is simply that all else being equal the shorter expression is clearer. Forcing
the reader to slog through additional syntax to read the code has a cost.

Obviously that argument is subject to abuse and it's very dependent on whether
you consider all else equal.

For a reduction argument, pretending there's a boolean type with no casts
leads to the excessively verbose java-ish syntax like:

if (foo != 0 && bar != null && baz != 0)

which you can't tell me is easier to read than

if (foo && bar && baz)

What I miss most in both C and Java is the lispish ability to write
expressions like:

foo = bar() || baz() || qux();

instead you have to write out things like this (except worse since you have to
start storing things in temporary variables):

if (bar() != null) {
foo = bar();
} else if (baz() != 0) {
foo = baz()
} else {
foo = qux();
}

I've seen code that consists of a dozen or so of these things repeated. When
the code starts getting so verbose that what could be 12 very simple and clear
lines no longer fits on the screen because you're trying to make it easier to
understand, well, I think you've lost the battle.

--
greg

#17Bruno Wolff III
bruno@wolff.to
In reply to: Greg Stark (#16)
Re: sparse (static analyzer) report

On Sat, Jan 15, 2005 at 10:44:48 -0500,
Greg Stark <gsstark@mit.edu> wrote:

What I miss most in both C and Java is the lispish ability to write
expressions like:

foo = bar() || baz() || qux();

Are you sure that C doesn't guarenty short circuit evaluation?
I don't have my C reference handy, but my memory is that evaluation
will stop after the first function call that returns true in the
above expression.

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruno Wolff III (#17)
Re: sparse (static analyzer) report

Bruno Wolff III <bruno@wolff.to> writes:

Greg Stark <gsstark@mit.edu> wrote:

What I miss most in both C and Java is the lispish ability to write
expressions like:

foo = bar() || baz() || qux();

Are you sure that C doesn't guarenty short circuit evaluation?
I don't have my C reference handy, but my memory is that evaluation
will stop after the first function call that returns true in the
above expression.

Yeah, but you can only find out the boolean result, not the actually
returned value --- that is, foo will get 1 or 0.

regards, tom lane

#19Tommi Mäkitalo
t.maekitalo@epgmbh.de
In reply to: Bruno Wolff III (#17)
Re: sparse (static analyzer) report

Am Samstag, 15. Januar 2005 21:38 schrieb Bruno Wolff III:

On Sat, Jan 15, 2005 at 10:44:48 -0500,

Greg Stark <gsstark@mit.edu> wrote:

What I miss most in both C and Java is the lispish ability to write
expressions like:

foo = bar() || baz() || qux();

Are you sure that C doesn't guarenty short circuit evaluation?
I don't have my C reference handy, but my memory is that evaluation
will stop after the first function call that returns true in the
above expression.

C do guaranty short circuit evaluation.

You can also write:

(foo = bar()) || (foo = baz()) || (foo = qux())

this is a valid shortcut, where bar(), baz() and qux() are not evaluated
twice, like in the if-cascade. But it is a ugly style every stylechecker
should have no problems complaining about. Even a compiler would warn about
'=' and '=='-confusion. But you can fix it:

(foo = bar()) != NULL || (foo = baz()) != NULL || foo = qux()) != NULL;

It's short, but not quite that readable.

Tommi

#20Neil Conway
neilc@samurai.com
In reply to: Mark Wong (#10)
Re: sparse (static analyzer) report

Mark Wong wrote:

Ah, so you beat me to it Neil. ;) Out of curiosity, how much worse
was it before you started fixing things?

As I recall, not too different than things are today -- sparse flagged a
bunch of stylistic issues that I fixed, like:

void some_func() { ... } => void some_func(void) { ... }

extern void some_func(void) { ... } => void some_func(void) { ... }

etc. But given that there are 3000 odd warnings (mostly bogus), I doubt
I made a significant dent on the total warning count.

Also, I don't recall sparse picking up any significant errors or
problems. As I said, I've been meaning to explore if that's all that
sparse can do -- I just looked at the lowest of the low-hanging fruit.

-Neil