Thread: What is lurking in the shadows?

What is lurking in the shadows?

From
Peter Smith
Date:
Hi hackers,

Recently I was involved with some patches [1][2] to fix code which was
mistakenly using a global "wrconn" variable instead of a local one.

That bug led me to wonder if similar problems might be going
undetected elsewhere in the code. There is a gcc compiler option [3]
-Wshadow which informs about the similar scenario where one variable
is "shadowing" another (e.g. redeclaring a variable with the same name
as one at an outer scope).

PSA a log file from a PG14 build (code from last week) run using the
-Wshadow flag. In this logfile I have filtered out everything except
the shadow warnings.

My plan initially was to just fix the few warnings found and present
the patches here, but it turned out there are far more cases than I
was anticipating.

There seem to be basically 3 categories of shadowing exposed in this logfile:
1. where a var declaration is shadowing a previously declared local
var (205 cases found)
2. where a var declaration is shadowing a function parameter (14 cases found)
3. where a var declaration is shadowing a global variable (110 cases found)

~~~

Of the dozen or so cases that I have looked at, so far I have been
unable to find anything that would result in any *real* errors.

But that is not to say they are harmless either - at the very least
IMO they affect code readability in ways that span the full spectrum
from "meh" to downright "dodgy-looking".

Some examples are possibly deliberate (albeit lazy / unimaginative?)
local re-declarations of variables like "i" and "buf" etc.

But many other examples (particularly the global shadows) seemed
clearly unintentional mistakes to me - like the code evolved and
continued working OK without warnings, so any introduced shadowing
just went unnoticed.

And who knows... maybe there are a few *real* bugs lurking within this list too?

~~~

For now, I am not sure how to proceed with this information. Hence this post...

- Perhaps a consistent convention for global variable names could have
prevented lots of these cases from occurring.

- Many of these shadow cases look unintentional to me; I feel the code
would have been implemented differently had the developer been aware
of them, so at least advertising their presence seems a useful thing
to know. Perhaps the -Wshadow flag can be added to one of the
build-farm machines for that purpose?

- Finally, IMO the code is nearly always more confusing when there is
variable shadowing, so removal of these warnings seems a worthy goal.
Perhaps they can be slowly whittled away during the course of PG 15
development?

Or am I just jumping at shadows?

Thoughts?

----------
[1] https://github.com/postgres/postgres/commit/4e8c0f1a0d0d095a749a329a216c88a340a455b6
[2] https://github.com/postgres/postgres/commit/db16c656478b815627a03bb0a31833391a733eb0
[3] https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

Re: What is lurking in the shadows?

From
David Rowley
Date:
On Fri, 14 May 2021 at 12:00, Peter Smith <smithpb2250@gmail.com> wrote:
> That bug led me to wonder if similar problems might be going
> undetected elsewhere in the code. There is a gcc compiler option [3]
> -Wshadow which informs about the similar scenario where one variable
> is "shadowing" another (e.g. redeclaring a variable with the same name
> as one at an outer scope).

> For now, I am not sure how to proceed with this information. Hence this post...

I'm inclined to think that since a bug has already been found due to a
local variable shadowing a global one that it would be good to review
these and then consider if it's worth doing any renaming. I think the
process of looking at each warning individually will allow us to
determine if; a) there are any bugs, or; b) if it's worth doing any
renaming.

I see GCC also has -Wshadow=compatible-local to warn when there
shadowing is going on in local vars where both vars have compatible
types.  -Wshadow=local is any local var shadowing, then the option you
used which is the same as -Wshadow=global.

I'd say it might be worth aspiring to reduce the warnings from
building with these flags. If we reduced these down then it might
allow us to more easily identify cases where there are actual bugs.
Maybe we can get to a point where we could enable either
-Wshadow=compatible-local or -Wshadow=local.  I doubt we could ever
get to a stage where -Wshadow=global would work for us.  There's also
some quite old discussion in [1] that you might want to review.

I don't pretend to have found the best example of ones that we might
want to leave alone, but:

pg_controldata.c: In function ‘wal_level_str’:
pg_controldata.c:73:24: warning: declaration of ‘wal_level’ shadows a
global declaration [-Wshadow]
 wal_level_str(WalLevel wal_level)
                        ^
In file included from pg_controldata.c:24:0:
../../../src/include/access/xlog.h:187:24: warning: shadowed
declaration is here [-Wshadow]
 extern PGDLLIMPORT int wal_level;

I wonder if it would really clear up much if the parameter name there
was renamed not to shadow the GUC variable's name.

Also, doing any renaming here is not without risk that we break
something, so certainly PG15 at the earliest, unless there is an
actual bug.

I imagine starting with a patch that fixes the ones where the name
does not have much meaning. e.g, i, buf, tmp, lc

We also need to take into account that renaming variables here can
increase the overhead of backpatching fixes.  The process of fixing
those up to make the patch apply to the back branch does increase the
chances that bugs could make their way into the back branches.
However, it's probably more likely to end up as a bug if the patch was
written for the back branch then there's a bigger opportunity for the
patch author to pick the wrong variable name when converting the patch
to work with master. In the reverse case, that does not seem as likely
due to both variables having the same name.

David

[1] https://www.postgresql.org/message-id/flat/877k1psmpf.fsf%40mailbox.samurai.com



Re: What is lurking in the shadows?

From
Michael Paquier
Date:
On Fri, May 14, 2021 at 01:16:37PM +1200, David Rowley wrote:
> I'm inclined to think that since a bug has already been found due to a
> local variable shadowing a global one that it would be good to review
> these and then consider if it's worth doing any renaming. I think the
> process of looking at each warning individually will allow us to
> determine if; a) there are any bugs, or; b) if it's worth doing any
> renaming.

70116493 is another instance of that, from a not-so-far past..

> I'd say it might be worth aspiring to reduce the warnings from
> building with these flags. If we reduced these down then it might
> allow us to more easily identify cases where there are actual bugs.
> Maybe we can get to a point where we could enable either
> -Wshadow=compatible-local or -Wshadow=local.  I doubt we could ever
> get to a stage where -Wshadow=global would work for us.  There's also
> some quite old discussion in [1] that you might want to review.

Agreed, not before the 15 branch opens for business for cosmetic
changes.  compatible-local did not sound that much interesting to me
on first sight, but the report of Peter tells the contrary: most of
the conflicts come from local problems.  I am not sure that you could
enable that safely though as PG_TRY() would complain on that, for
example in ProcessUtilitySlow().

> We also need to take into account that renaming variables here can
> increase the overhead of backpatching fixes.  The process of fixing
> those up to make the patch apply to the back branch does increase the
> chances that bugs could make their way into the back branches.
> However, it's probably more likely to end up as a bug if the patch was
> written for the back branch then there's a bigger opportunity for the
> patch author to pick the wrong variable name when converting the patch
> to work with master. In the reverse case, that does not seem as likely
> due to both variables having the same name.

That may be tricky, even if global or local variables are changed,
but I'd like to think that there is room for improvement.  Looking at
the report, the global conflicts involve:
- synchronous_commit
- ssl_key_file
- wal_segment_size
- DataDir, with the control data business.

These seem changeable without much holes with potential back-patches.
--
Michael

Attachment

Re: What is lurking in the shadows?

From
Peter Smith
Date:
On Fri, May 14, 2021 at 11:16 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Fri, 14 May 2021 at 12:00, Peter Smith <smithpb2250@gmail.com> wrote:
> > That bug led me to wonder if similar problems might be going
> > undetected elsewhere in the code. There is a gcc compiler option [3]
> > -Wshadow which informs about the similar scenario where one variable
> > is "shadowing" another (e.g. redeclaring a variable with the same name
> > as one at an outer scope).
>
> > For now, I am not sure how to proceed with this information. Hence this post...
>
> I'm inclined to think that since a bug has already been found due to a
> local variable shadowing a global one that it would be good to review
> these and then consider if it's worth doing any renaming. I think the
> process of looking at each warning individually will allow us to
> determine if; a) there are any bugs, or; b) if it's worth doing any
> renaming.
>

Hi David, Michael - Thanks for your replies.

Yeah, I would like to work my way through all of these warnings in my
spare time and report back to this thread (after 1-2 months?) with a
detailed analysis.

After that it should become much clearer what / if any action should
be taken next.

----------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: What is lurking in the shadows?

From
Michael Paquier
Date:
On Tue, May 18, 2021 at 12:08:57PM +1000, Peter Smith wrote:
> Yeah, I would like to work my way through all of these warnings in my
> spare time and report back to this thread (after 1-2 months?) with a
> detailed analysis.

The next commit fest is at the beginning of July, so there are a
couple of weeks of margin here.

> After that it should become much clearer what / if any action should
> be taken next.

If you can dive into the details, that would be nice!  My take would
be to look first at the local-local conflicts and rename all the
variables that conflict so as any backpatch done in the areas changed
cause a compilation failure.  Some of the global-local conflicts are
simple enough to solve, these could go second.  Each of them requires
a case-by-case lookup, of course.
--
Michael

Attachment

Re: What is lurking in the shadows?

From
David Rowley
Date:
On Tue, 18 May 2021 at 14:09, Peter Smith <smithpb2250@gmail.com> wrote:
> Yeah, I would like to work my way through all of these warnings in my
> spare time and report back to this thread (after 1-2 months?) with a
> detailed analysis.

I'd recommend for any patches that they come in bite-sized chunks. A
committer is going to have to re-review each change. For me
personally, I'll probably run for the hills if I see a patch that
renames 200 variables.

I'd think about a dozen would be good. Starting with ones that are
least likely to raise objection also seems like a good idea.  That way
you'll have an idea if you want to trouble yourself with the more
questionable ones when the less questionable ones raised too many
questions.  Like I mentioned, start with ones like i, buf, tmp, lc. If
those are accepted then move on to the more difficult ones.  Unless
you discover bugs, then there's not really any urgency to fix these.
Doing it in bite-sized chunks is less likely going to cause
frustration for you if some of the work is rejected after you've gone
to all the trouble.

Also, going by what's mentioned in [1], in particular [2], I'm not so
certain that these changes will be received well by everyone.  So I
recommend just taking it slow.

David

[1] https://www.postgresql.org/message-id/flat/877k1psmpf.fsf%40mailbox.samurai.com
[2] https://www.postgresql.org/message-id/22920.1069708226%40sss.pgh.pa.us