Thread: sparse (static analyzer) report

sparse (static analyzer) report

From
Mark Wong
Date:
Hi,

Just wondering if anyone finds spare's analysis useful.  I ran it
against 8.0-rc5:http://developer.osdl.org/markw/pgsql/sparse/pg-8.0rc5.txt

Sparse can be downloadedhttp://www.codemonkey.org.uk/projects/bitkeeper/sparse/
orbk://sparse.bkbits.net/sparse


Re: sparse (static analyzer) report

From
Mark Wong
Date:
We've also started automating sparse analyses in our PLM tool, which
will show an error and warning count.  Here's an
example:http://www.osdl.org/plm-cgi/plm?module=patch_info&patch_id=4065

On Wed, Jan 12, 2005 at 02:18:36PM -0800, Mark Wong wrote:
> Hi,
> 
> Just wondering if anyone finds spare's analysis useful.  I ran it
> against 8.0-rc5:
>     http://developer.osdl.org/markw/pgsql/sparse/pg-8.0rc5.txt
> 
> Sparse can be downloaded
>     http://www.codemonkey.org.uk/projects/bitkeeper/sparse/
> or
>     bk://sparse.bkbits.net/sparse
> 



Re: sparse (static analyzer) report

From
Alvaro Herrera
Date:
On Thu, Jan 13, 2005 at 01:31:36PM -0800, Mark Wong wrote:
> We've also started automating sparse analyses in our PLM tool, which
> will show an error and warning count.  Here's an example:
>     http://www.osdl.org/plm-cgi/plm?module=patch_info&patch_id=4065

I took a peek at the first sparse report you posted, and it's too noisy
to be useful.  The parser seems confused in several ways in thousands of
places.

Maybe there's something useful to be extracted, but we'd need to adapt
sparse.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
www.google.com: interfaz de línea de comando para la web.


Re: sparse (static analyzer) report

From
Mark Wong
Date:
On Thu, Jan 13, 2005 at 08:06:23PM -0300, Alvaro Herrera wrote:
> On Thu, Jan 13, 2005 at 01:31:36PM -0800, Mark Wong wrote:
> > We've also started automating sparse analyses in our PLM tool, which
> > will show an error and warning count.  Here's an example:
> >     http://www.osdl.org/plm-cgi/plm?module=patch_info&patch_id=4065
> 
> I took a peek at the first sparse report you posted, and it's too noisy
> to be useful.  The parser seems confused in several ways in thousands of
> places.
> 
> Maybe there's something useful to be extracted, but we'd need to adapt
> sparse.
> 

It might not have helped to dump every source file's report into a
single file.  Would it have helped to split out the results per file?

Mark


Re: sparse (static analyzer) report

From
Mark Wong
Date:
On Thu, Jan 13, 2005 at 03:09:19PM -0800, Mark Wong wrote:
> On Thu, Jan 13, 2005 at 08:06:23PM -0300, Alvaro Herrera wrote:
> > On Thu, Jan 13, 2005 at 01:31:36PM -0800, Mark Wong wrote:
> > > We've also started automating sparse analyses in our PLM tool, which
> > > will show an error and warning count.  Here's an example:
> > >     http://www.osdl.org/plm-cgi/plm?module=patch_info&patch_id=4065
> > 
> > I took a peek at the first sparse report you posted, and it's too noisy
> > to be useful.  The parser seems confused in several ways in thousands of
> > places.
> > 
> > Maybe there's something useful to be extracted, but we'd need to adapt
> > sparse.
> > 
> 
> It might not have helped to dump every source file's report into a
> single file.  Would it have helped to split out the results per file?
> 

Something like this:http://developer.osdl.org/markw/pgsql/sparse/pg-8.0rc5.html

Mark


Re: sparse (static analyzer) report

From
Alvaro Herrera
Date:
On Fri, Jan 14, 2005 at 03:53:09PM -0800, Mark Wong wrote:
> On Thu, Jan 13, 2005 at 03:09:19PM -0800, Mark Wong wrote:
> > On Thu, Jan 13, 2005 at 08:06:23PM -0300, Alvaro Herrera wrote:
> > > On Thu, Jan 13, 2005 at 01:31:36PM -0800, Mark Wong wrote:
> > > > We've also started automating sparse analyses in our PLM tool, which
> > > > will show an error and warning count.  Here's an example:

>     http://developer.osdl.org/markw/pgsql/sparse/pg-8.0rc5.html

Hmm.  Well, it showed the multiple incorrect uses of 0 as NULL in
dllist.c and other places, but there's still lots of spurious entries.
See backend/transam/xlog.c: it's strange that the parser is so confused
about XLogCtrl, for example.

It's complaining in several places about function as variables in
function declarations (the multiple walkers and mutators for example);
not sure how correct that is.

It is also analyzing flex and bison output files ... is it capable of
analyzing .y and .l files instead?

It's strange that DatumGetInt32 shows as a undefined identifier ...
there's some problem with postgres.h apparently.  And fmgroids.h is
missing; not sure what's the minimal make target to install it, because
make -C src/backend/utils fmgroids.h
generates it, but the symbolic link to src/include/utils is still
needed.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"La gente vulgar solo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"


Re: sparse (static analyzer) report

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> ... And fmgroids.h is
> missing; not sure what's the minimal make target to install it, because
> make -C src/backend/utils fmgroids.h
> generates it, but the symbolic link to src/include/utils is still
> needed.

Looks like the symlinks are made in src/backend/Makefile.
        regards, tom lane


Re: sparse (static analyzer) report

From
Mark Wong
Date:
On Fri, Jan 14, 2005 at 09:54:24PM -0300, Alvaro Herrera wrote:
> On Fri, Jan 14, 2005 at 03:53:09PM -0800, Mark Wong wrote:
> > On Thu, Jan 13, 2005 at 03:09:19PM -0800, Mark Wong wrote:
> > > On Thu, Jan 13, 2005 at 08:06:23PM -0300, Alvaro Herrera wrote:
> > > > On Thu, Jan 13, 2005 at 01:31:36PM -0800, Mark Wong wrote:
> > > > > We've also started automating sparse analyses in our PLM tool, which
> > > > > will show an error and warning count.  Here's an example:
> 
> >     http://developer.osdl.org/markw/pgsql/sparse/pg-8.0rc5.html
> 
> Hmm.  Well, it showed the multiple incorrect uses of 0 as NULL in
> dllist.c and other places, but there's still lots of spurious entries.
> See backend/transam/xlog.c: it's strange that the parser is so confused
> about XLogCtrl, for example.

I'm sure there are a number of false positives, if that's what you're
getting at.

> It's complaining in several places about function as variables in
> function declarations (the multiple walkers and mutators for example);
> not sure how correct that is.
> 
> It is also analyzing flex and bison output files ... is it capable of
> analyzing .y and .l files instead?

Yeah, I generate the file list to run sparse against with a
"find . -name '*.c'".  So that's simple enough.  But flex and bison
files don't end in .c, do they?

> It's strange that DatumGetInt32 shows as a undefined identifier ...
> there's some problem with postgres.h apparently.  And fmgroids.h is
> missing; not sure what's the minimal make target to install it, because
> make -C src/backend/utils fmgroids.h
> generates it, but the symbolic link to src/include/utils is still
> needed.

Perhaps part of this has to do with my rather blind selection of files
to run sparse against.  For example, you still have to run configure
first and it probably doesn't make sense to run on some of the files
in ports.  I did notice one of the files complained about missing
Windows.h. :)

Mark


Re: sparse (static analyzer) report

From
Neil Conway
Date:
BTW, perhaps one reason for the relatively small number of legitimate 
issues picked up by sparse is that I ran sparse on the tree a month or 
two ago and fixed some of the stylistic issues it reported. Most of the 
stuff I didn't bother to fix looked like either a sparse bug, or a 
marginal style improvement I didn't bother applying (like fixing 0 => 
NULL in dllist.c).

I've been meaning to investigate whether sparse can be used as something 
more than just a fussy syntax checker (i.e. whether it can do any 
meaningful static analysis for interesting properties), but I haven't 
had a chance yet.

Alvaro Herrera wrote:
> It's complaining in several places about function as variables in
> function declarations (the multiple walkers and mutators for example);
> not sure how correct that is.

I believe the conclusion of prior discussions about making the 
walker/mutator prototypes more precise is that it's not worth the cost.

-Neil

P.S. Hope everyone had a good holiday. I'm back at work on Monday.


Re: sparse (static analyzer) report

From
Mark Wong
Date:
Ah, so you beat me to it Neil. ;)  Out of curiosity, how much worse
was it before you started fixing things?

Mark

On Sat, Jan 15, 2005 at 01:30:37PM +1100, Neil Conway wrote:
> BTW, perhaps one reason for the relatively small number of legitimate 
> issues picked up by sparse is that I ran sparse on the tree a month or 
> two ago and fixed some of the stylistic issues it reported. Most of the 
> stuff I didn't bother to fix looked like either a sparse bug, or a 
> marginal style improvement I didn't bother applying (like fixing 0 => 
> NULL in dllist.c).
> 
> I've been meaning to investigate whether sparse can be used as something 
> more than just a fussy syntax checker (i.e. whether it can do any 
> meaningful static analysis for interesting properties), but I haven't 
> had a chance yet.
> 
> Alvaro Herrera wrote:
> > It's complaining in several places about function as variables in
> > function declarations (the multiple walkers and mutators for example);
> > not sure how correct that is.
> 
> I believe the conclusion of prior discussions about making the 
> walker/mutator prototypes more precise is that it's not worth the cost.
> 
> -Neil
> 
> P.S. Hope everyone had a good holiday. I'm back at work on Monday.




Re: sparse (static analyzer) report

From
Alvaro Herrera
Date:
On Fri, Jan 14, 2005 at 05:50:30PM -0800, Mark Wong wrote:

> Yeah, I generate the file list to run sparse against with a
> "find . -name '*.c'".  So that's simple enough.  But flex and bison
> files don't end in .c, do they?

Generated files do.  I have a list of generated files here:

pl_gram.c
pl_scan.c
psqlscan.c
preproc.c
pgc.c
guc-file.c
fmgrtab.c
gram.c
scan.c
bootparse.c
bootscanner.c

Not sure how to skip them with find ... I think you can do that with
-regex but it's cumbersome.  Re: port, I think you can -prune it.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
Jude: I wish humans laid eggs
Ringlord: Why would you want humans to lay eggs?
Jude: So I can eat them


Re: sparse (static analyzer) report

From
Greg Stark
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

> Hmm.  Well, it showed the multiple incorrect uses of 0 as NULL in
> dllist.c and other places, 

Incidentally, while it may not be conformant to your style guidelines, use of
the constant 0 compared to or assigned to a pointer is a perfectly valid ANSI
spelling for NULL. (The same is not true for an expression that happens to
evaluate to 0.)

-- 
greg



Re: sparse (static analyzer) report

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
>> Hmm.  Well, it showed the multiple incorrect uses of 0 as NULL in
>> dllist.c and other places, 

> Incidentally, while it may not be conformant to your style guidelines, use of
> the constant 0 compared to or assigned to a pointer is a perfectly valid ANSI
> spelling for NULL.

Absolutely.  But I agree that it is more readable to use NULL when you
mean a null pointer, and 0 when you mean an integer zero.  The C
standard may not distinguish these concepts, but I do ;-)

Something that I don't have a real strong feeling about isif (ptr != NULL)
versusif (ptr)
I've been known to write both.  Can anyone mount a good readability
argument for one over the other?

How about the inverse case,if (ptr == NULL)
versusif (!ptr)
Applying a boolean ! to a pointer seems a bit shaky to me, though
it's certainly a common locution.
        regards, tom lane


Re: sparse (static analyzer) report

From
Andrew Dunstan
Date:

Tom Lane wrote:

>Greg Stark <gsstark@mit.edu> writes:
>  
>
>>Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
>>    
>>
>>>Hmm.  Well, it showed the multiple incorrect uses of 0 as NULL in
>>>dllist.c and other places, 
>>>      
>>>
>
>  
>
>>Incidentally, while it may not be conformant to your style guidelines, use of
>>the constant 0 compared to or assigned to a pointer is a perfectly valid ANSI
>>spelling for NULL.
>>    
>>
>
>Absolutely.  But I agree that it is more readable to use NULL when you
>mean a null pointer, and 0 when you mean an integer zero.  The C
>standard may not distinguish these concepts, but I do ;-)
>
>Something that I don't have a real strong feeling about is
>    if (ptr != NULL)
>versus
>    if (ptr)
>I've been known to write both.  Can anyone mount a good readability
>argument for one over the other?
>
>How about the inverse case,
>    if (ptr == NULL)
>versus
>    if (!ptr)
>Applying a boolean ! to a pointer seems a bit shaky to me, though
>it's certainly a common locution.
>
>
>  
>

If we allow "if (ptr)" then allowing the inverse to be "if (! ptr)" 
seems logical enough. As you say, it's a very common idiom, and allowing 
one without the other would be rather non-orthogonal.

cheers

andrew


Re: sparse (static analyzer) report

From
Alvaro Herrera
Date:
On Sat, Jan 15, 2005 at 08:31:42AM -0500, Andrew Dunstan wrote:

> Tom Lane wrote:

> >Something that I don't have a real strong feeling about is
> >    if (ptr != NULL)
> >versus
> >    if (ptr)
> >I've been known to write both.  Can anyone mount a good readability
> >argument for one over the other?

I assume people don't like the PointerIsValid() macro defined in c.h?

Just for consistency we could use that everywhere or get rid of it ...
but then, maybe it is used by external code so we shouldn't do the
latter.

> >How about the inverse case,
> >    if (ptr == NULL)
> >versus
> >    if (!ptr)
> >Applying a boolean ! to a pointer seems a bit shaky to me, though
> >it's certainly a common locution.
> 
> If we allow "if (ptr)" then allowing the inverse to be "if (! ptr)" 
> seems logical enough. As you say, it's a very common idiom, and allowing 
> one without the other would be rather non-orthogonal.

I'd rather have legibility over orthogonality on this issue.  I prefer
ptr == NULL myself, though not too strongly.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"No necesitamos banderasNo reconocemos fronteras"                  (Jorge Gonz�lez)


Re: sparse (static analyzer) report

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

>     if (ptr)
>
> Can anyone mount a good readability argument for one over the other?

My argument in favour of "if (ptr)" as well as "if (integer)" and "if (!ptr)"
is simply that all else being equal the shorter expression is clearer. Forcing
the reader to slog through additional syntax to read the code has a cost.

Obviously that argument is subject to abuse and it's very dependent on whether
you consider all else equal. 

For a reduction argument, pretending there's a boolean type with no casts
leads to the excessively verbose java-ish syntax like:
if (foo != 0 && bar != null && baz != 0)

which you can't tell me is easier to read than
if (foo && bar && baz)


What I miss most in both C and Java is the lispish ability to write
expressions like:
foo = bar() || baz() || qux();

instead you have to write out things like this (except worse since you have to
start storing things in temporary variables):
if (bar() != null) {  foo = bar();} else if (baz() != 0) {  foo = baz()} else {  foo = qux();}

I've seen code that consists of a dozen or so of these things repeated. When
the code starts getting so verbose that what could be 12 very simple and clear
lines no longer fits on the screen because you're trying to make it easier to
understand, well, I think you've lost the battle.

-- 
greg



Re: sparse (static analyzer) report

From
Bruno Wolff III
Date:
On Sat, Jan 15, 2005 at 10:44:48 -0500, Greg Stark <gsstark@mit.edu> wrote:
> 
> 
> What I miss most in both C and Java is the lispish ability to write
> expressions like:
> 
>  foo = bar() || baz() || qux();

Are you sure that C doesn't guarenty short circuit evaluation?
I don't have my C reference handy, but my memory is that evaluation
will stop after the first function call that returns true in the
above expression.


Re: sparse (static analyzer) report

From
Tom Lane
Date:
Bruno Wolff III <bruno@wolff.to> writes:
>   Greg Stark <gsstark@mit.edu> wrote:
>> What I miss most in both C and Java is the lispish ability to write
>> expressions like:
>> 
>> foo = bar() || baz() || qux();

> Are you sure that C doesn't guarenty short circuit evaluation?
> I don't have my C reference handy, but my memory is that evaluation
> will stop after the first function call that returns true in the
> above expression.

Yeah, but you can only find out the boolean result, not the actually
returned value --- that is, foo will get 1 or 0.
        regards, tom lane


Re: sparse (static analyzer) report

From
Tommi Mäkitalo
Date:
Am Samstag, 15. Januar 2005 21:38 schrieb Bruno Wolff III:
> On Sat, Jan 15, 2005 at 10:44:48 -0500,
>
>   Greg Stark <gsstark@mit.edu> wrote:
> > What I miss most in both C and Java is the lispish ability to write
> > expressions like:
> >
> >  foo = bar() || baz() || qux();
>
> Are you sure that C doesn't guarenty short circuit evaluation?
> I don't have my C reference handy, but my memory is that evaluation
> will stop after the first function call that returns true in the
> above expression.
>
C do guaranty short circuit evaluation.

You can also write:

(foo = bar()) || (foo = baz()) || (foo = qux())

this is a valid shortcut, where bar(), baz() and qux() are not evaluated 
twice, like in the if-cascade. But it is a ugly style every stylechecker 
should have no problems complaining about. Even a compiler would warn about 
'=' and '=='-confusion. But you can fix it:

(foo = bar()) != NULL || (foo = baz()) != NULL || foo = qux()) != NULL;

It's short, but not quite that readable.


Tommi


Re: sparse (static analyzer) report

From
Neil Conway
Date:
Mark Wong wrote:
> Ah, so you beat me to it Neil. ;)  Out of curiosity, how much worse
> was it before you started fixing things?

As I recall, not too different than things are today -- sparse flagged a 
bunch of stylistic issues that I fixed, like:

void some_func() { ... } => void some_func(void) { ... }

extern void some_func(void) { ... } => void some_func(void) { ... }

etc. But given that there are 3000 odd warnings (mostly bogus), I doubt 
I made a significant dent on the total warning count.

Also, I don't recall sparse picking up any significant errors or 
problems. As I said, I've been meaning to explore if that's all that 
sparse can do -- I just looked at the lowest of the low-hanging fruit.

-Neil