Thread: The pgrminclude problem

The pgrminclude problem

From
Peter Geoghegan
Date:
I found a tool that Google authored that attempts to solve the same
problems as pgrminclude does in our codebase.  It's called "include
what you use", and is based on Clang. The project is hosted here:

http://code.google.com/p/include-what-you-use/

I'm not suggesting that we should start using this tool instead of
pgrminclude, because it has enough caveats of its own, and is mostly
written with Google's C++ codebase in mind. However, it's worth being
aware of. There is a useful analysis of the general problem in the
README - "Why IWYU is Difficult" (you can probably just skip the
extensive analysis of C++ templates as they relate to the problem
there).

The tool is authored by Craig Silverstein, Google's director of
technology. If he believes that IWYU is a difficult problem, well, it
probably is.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: The pgrminclude problem

From
Bruce Momjian
Date:
On Thu, Aug 16, 2012 at 04:48:34PM +0100, Peter Geoghegan wrote:
> I found a tool that Google authored that attempts to solve the same
> problems as pgrminclude does in our codebase.  It's called "include
> what you use", and is based on Clang. The project is hosted here:
> 
> http://code.google.com/p/include-what-you-use/
> 
> I'm not suggesting that we should start using this tool instead of
> pgrminclude, because it has enough caveats of its own, and is mostly
> written with Google's C++ codebase in mind. However, it's worth being
> aware of. There is a useful analysis of the general problem in the
> README - "Why IWYU is Difficult" (you can probably just skip the
> extensive analysis of C++ templates as they relate to the problem
> there).
> 
> The tool is authored by Craig Silverstein, Google's director of
> technology. If he believes that IWYU is a difficult problem, well, it
> probably is.

Good to know. We only use pgrminclude very five years or so, and Tom
isn't even keen on that.

I added the URL to our src/tools/pginclude/README.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: The pgrminclude problem

From
Peter Geoghegan
Date:
On 16 August 2012 16:56, Bruce Momjian <bruce@momjian.us> wrote:
> Good to know. We only use pgrminclude very five years or so, and Tom
> isn't even keen on that.

Yeah. Even if this could be made to work well, we'd still have to do
something like get an absolute consensus from all build farm animals,
if we expected to have an absolutely trustworthy list. I don't think
pgrminclude is a bad idea. I just think that it should only be used to
guide the efforts of a human to remove superfluous #includes, which is
how it is used anyway.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: The pgrminclude problem

From
Robert Haas
Date:
On Thu, Aug 16, 2012 at 12:17 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> On 16 August 2012 16:56, Bruce Momjian <bruce@momjian.us> wrote:
>> Good to know. We only use pgrminclude very five years or so, and Tom
>> isn't even keen on that.
>
> Yeah. Even if this could be made to work well, we'd still have to do
> something like get an absolute consensus from all build farm animals,
> if we expected to have an absolutely trustworthy list. I don't think
> pgrminclude is a bad idea. I just think that it should only be used to
> guide the efforts of a human to remove superfluous #includes, which is
> how it is used anyway.

I actually think we'd probably be better off running pgrminclude once
per release cycle rather than any less often.  When the number of
changes gets into the hundreds or thousands of lines it becomes much
more difficult to validate that it's doing anything sensible.  I ran
it a while back and found a bunch of stuff that looked like it was
obviously worth fixing, but I was afraid of getting yelled at if I
went and fixed it, so I didn't.  Somehow that doesn't seem like an
ideal situation...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: The pgrminclude problem

From
Alvaro Herrera
Date:
Excerpts from Robert Haas's message of lun ago 20 11:43:44 -0400 2012:

> I actually think we'd probably be better off running pgrminclude once
> per release cycle rather than any less often.  When the number of
> changes gets into the hundreds or thousands of lines it becomes much
> more difficult to validate that it's doing anything sensible.  I ran
> it a while back and found a bunch of stuff that looked like it was
> obviously worth fixing, but I was afraid of getting yelled at if I
> went and fixed it, so I didn't.  Somehow that doesn't seem like an
> ideal situation...

Alternatively you could post a patch for comment.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: The pgrminclude problem

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Aug 16, 2012 at 12:17 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
>> Yeah. Even if this could be made to work well, we'd still have to do
>> something like get an absolute consensus from all build farm animals,
>> if we expected to have an absolutely trustworthy list. I don't think
>> pgrminclude is a bad idea. I just think that it should only be used to
>> guide the efforts of a human to remove superfluous #includes, which is
>> how it is used anyway.

> I actually think we'd probably be better off running pgrminclude once
> per release cycle rather than any less often.

If it were more automatic and less prone to give bogus answers, I could
get behind that ... but as is, I'd frankly be happier if we *never* ran
it.  It took quite a lot of effort to dig out from under the mess it
made last time, and I don't recall that we have ever had a run that
was entirely trouble-free.

Now, a contributing factor to the most recent mess was that somebody had
created circular header #include's; maybe it would help if the thing
were programmed to notice that and punt, rather than doing its best to
wind the ball of string even tighter.  In general, though, any
recommendation from the tool to remove #includes in headers, as opposed
to consumer .c files, needs to be taken with about ten grains of salt.

The other serious problem, as Peter notes, is that there are inclusions
that are only needed on particular platforms or with particular build
options.  AFAIK, Bruce's current methodology for running pgrminclude
takes no account of that.
        regards, tom lane



Re: The pgrminclude problem

From
Robert Haas
Date:
On Mon, Aug 20, 2012 at 12:03 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Excerpts from Robert Haas's message of lun ago 20 11:43:44 -0400 2012:
>> I actually think we'd probably be better off running pgrminclude once
>> per release cycle rather than any less often.  When the number of
>> changes gets into the hundreds or thousands of lines it becomes much
>> more difficult to validate that it's doing anything sensible.  I ran
>> it a while back and found a bunch of stuff that looked like it was
>> obviously worth fixing, but I was afraid of getting yelled at if I
>> went and fixed it, so I didn't.  Somehow that doesn't seem like an
>> ideal situation...
>
> Alternatively you could post a patch for comment.

Yeah, maybe I'll try that if I get back around to working on this at some point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company