Thread: Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Bruce Momjian
Date:
Tom Lane wrote:
> Clean up the #include mess a little.
> 
> walsender.h should depend on xlog.h, not vice versa.  (Actually, the
> inclusion was circular until a couple hours ago, which was even sillier;
> but Bruce broke it in the expedient rather than logically correct
> direction.)  Because of that poor decision, plus blind application of
> pgrminclude, we had a situation where half the system was depending on
> xlog.h to include such unrelated stuff as array.h and guc.h.  Clean up
> the header inclusion, and manually revert a lot of what pgrminclude had
> done so things build again.
> 
> This episode reinforces my feeling that pgrminclude should not be run
> without adult supervision.  Inclusion changes in header files in particular
> need to be reviewed with great care.  More generally, it'd be good if we
> had a clearer notion of module layering to dictate which headers can sanely
> include which others ... but that's a big task for another day.

What pgrminclude does it to lock down the minimal set of includes, and
that easily could cause something like xlog.h becoming the go-to include
file for many C files.  I don't remember this problem happening before
but it clearly happened this time.

Not sure how to avoid that except, as you said, analyze the entire
changeset of pgrminclude.  For this run, I focused on not breaking any
platform builds so I was not focusing on the actual include file layout.
I assumed fiddling with the actual pgrminclude output would likely break
builds.

The process I used was to get pgcompinclude to allow all include files
to compile (so they their inclusion would not bleed into files that
included them), then run pgrminclude.

Well, I assume we are done for another five years.  The includes removed
were minimal, especially considering five years of work.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Greg Stark
Date:
On Mon, Sep 5, 2011 at 2:52 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Well, I assume we are done for another five years.  The includes removed
> were minimal, especially considering five years of work.
>

What I wouldn't mind seeing is a graph of all includes and what they
include. This might help figure out what layering violations there are
like the one that caused this mess. I think I've seen tools to do this
already somewhere.

--
greg


Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Magnus Hagander
Date:
On Mon, Sep 5, 2011 at 15:55, Greg Stark <stark@mit.edu> wrote:
> On Mon, Sep 5, 2011 at 2:52 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> Well, I assume we are done for another five years.  The includes removed
>> were minimal, especially considering five years of work.
>>
>
> What I wouldn't mind seeing is a graph of all includes and what they
> include. This might help figure out what layering violations there are
> like the one that caused this mess. I think I've seen tools to do this
> already somewhere.


http://doxygen.postgresql.org will do some of that, but I think not
globally - but if you click into one header, I think it shows you the
map from that perspective.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Alvaro Herrera
Date:
Excerpts from Magnus Hagander's message of lun sep 05 11:02:23 -0300 2011:
> On Mon, Sep 5, 2011 at 15:55, Greg Stark <stark@mit.edu> wrote:
> > On Mon, Sep 5, 2011 at 2:52 PM, Bruce Momjian <bruce@momjian.us> wrote:
> >> Well, I assume we are done for another five years.  The includes removed
> >> were minimal, especially considering five years of work.
> >
> > What I wouldn't mind seeing is a graph of all includes and what they
> > include. This might help figure out what layering violations there are
> > like the one that caused this mess. I think I've seen tools to do this
> > already somewhere.
> 
> http://doxygen.postgresql.org will do some of that, but I think not
> globally - but if you click into one header, I think it shows you the
> map from that perspective.

Yeah; and it isn't always complete, because some graphs tend to get too
unwieldy so it has to prune (you can see this because some nodes show up
with red borders).

I am not sure it is really feasible to build a complete graph for all
headers.  We have too many of them and too many dependencies.

Another useful graph to see is what files include a given header.  A
funny thing is that doxygen doesn't always display this; for example
http://doxygen.postgresql.org/rel_8h.html

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I am not sure it is really feasible to build a complete graph for all
> headers.  We have too many of them and too many dependencies.

Yeah, it's the "too many dependencies" aspect that is bothering me.

The only concrete idea I've come up with so far is that it'd be a good
idea to isolate certain primitive datatypes into their own group of
headers.  We have a number of headers that are meant to be this sort
of animal already, eg storage/block.h, storage/relfilenode.h.  But
(1) there's no clear distinction between these headers and ones like,
say, storage/smgr.h or storage/proc.h.
(2) other things that have become widely-used primitive datatypes,
such as TimestampTz, are declared in places that ideally ought to be
near the top of the #include hierarchy not the bottom.

So I think we could make some forward progress by moving all these
simple datatype declarations into a separate set of headers in their
own subdirectory of src/include/, perhaps "datatype".  There would
be a hard and fast rule that no header in this set could depend on
anything beyond postgres.h and other members of the same set, so
that these headers clearly form the bottom level of the #include
hierarchy.  Probably some of the stuff now in postgres.h could migrate
to this group too.

Eventually I'd like to see some fairly clear layering rules at the
header-directory level, like "storage/ is lower-level than commands/
so nothing in the former directory should include anything in the
latter".  But achieving that is a long way off.

Of course, the problem with all of this is that making much progress
would be a large amount of work with relatively small concrete payoff.
Still, I'm starting to feel that we've got such a spaghetti-like mess
that we need to do something.
        regards, tom lane


Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Ants Aasma
Date:
On Mon, Sep 5, 2011 at 4:55 PM, Greg Stark <stark@mit.edu> wrote:
> What I wouldn't mind seeing is a graph of all includes and what they
> include. This might help figure out what layering violations there are
> like the one that caused this mess. I think I've seen tools to do this
> already somewhere.

I whipped together a quick Python script to do this. Attached is the
Python script (requires pydot) and the result of running it on includes/.
I didn't attach the png version of the output because it was 7MB.

If rendering all includes at once doesn't give a good overview it can
also select a subset through traversing dependencies. For example:
render_includes.py -i include/ \
  --select="storage/spin.h+*,access/xlog.h+*" output.png

This will render everything that directly or indirectly depends  on those
two headers. See --help for details.

--
Ants Aasma

Attachment

Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Alvaro Herrera
Date:
Excerpts from Ants Aasma's message of mar sep 06 12:40:04 -0300 2011:
> On Mon, Sep 5, 2011 at 4:55 PM, Greg Stark <stark@mit.edu> wrote:
> > What I wouldn't mind seeing is a graph of all includes and what they
> > include. This might help figure out what layering violations there are
> > like the one that caused this mess. I think I've seen tools to do this
> > already somewhere.
> 
> I whipped together a quick Python script to do this. Attached is the
> Python script (requires pydot) and the result of running it on includes/.
> I didn't attach the png version of the output because it was 7MB.
> 
> If rendering all includes at once doesn't give a good overview it can
> also select a subset through traversing dependencies. For example:
> render_includes.py -i include/ \
>   --select="storage/spin.h+*,access/xlog.h+*" output.png
> 
> This will render everything that directly or indirectly depends  on those
> two headers. See --help for details.

Wow, interesting, thanks.

What this says to me is that we should do something about execnodes.h
and some other nodes file (parsenodes.h I think).

I wonder what happens if files in the same subdir are grouped in a
subgraph.  Is that possible?

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Ants Aasma
Date:
On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> I wonder what happens if files in the same subdir are grouped in a
> subgraph.  Is that possible?

Possible, and done. Also added possivility to add .c files to the graph,
coloring by subdir and possibility exclude nodes from the graph. I didn't yet
bother to clean up the code - to avoid eye damage, don't look at the source.

Bad news is that it doesn't significantly help readability for the all nodes
case. See all_with_subgraphs.svgz.  It does help for other cases.
For example parsenodes.h.svgz has the result for
render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs
and execnodes.h.svgz for
--subgraphs --select='nodes/execnodes.h+*-*'

--
Ants Aasma

Attachment

Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Bruce Momjian
Date:
On Wed, Sep  7, 2011 at 01:20:17AM +0300, Ants Aasma wrote:
> On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
> > I wonder what happens if files in the same subdir are grouped in a
> > subgraph.  Is that possible?
> 
> Possible, and done. Also added possivility to add .c files to the graph,
> coloring by subdir and possibility exclude nodes from the graph. I didn't yet
> bother to clean up the code - to avoid eye damage, don't look at the source.
> 
> Bad news is that it doesn't significantly help readability for the all nodes
> case. See all_with_subgraphs.svgz.  It does help for other cases.
> For example parsenodes.h.svgz has the result for
> render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs
> and execnodes.h.svgz for
> --subgraphs --select='nodes/execnodes.h+*-*'

Should we add this script and instructions to src/tools/pginclude?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Alvaro Herrera
Date:
Excerpts from Bruce Momjian's message of mié ago 15 18:30:40 -0400 2012:
>
> On Wed, Sep  7, 2011 at 01:20:17AM +0300, Ants Aasma wrote:
> > On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera
> > <alvherre@commandprompt.com> wrote:
> > > I wonder what happens if files in the same subdir are grouped in a
> > > subgraph.  Is that possible?
> >
> > Possible, and done. Also added possivility to add .c files to the graph,
> > coloring by subdir and possibility exclude nodes from the graph. I didn't yet
> > bother to clean up the code - to avoid eye damage, don't look at the source.
> >
> > Bad news is that it doesn't significantly help readability for the all nodes
> > case. See all_with_subgraphs.svgz.  It does help for other cases.
> > For example parsenodes.h.svgz has the result for
> > render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs
> > and execnodes.h.svgz for
> > --subgraphs --select='nodes/execnodes.h+*-*'
>
> Should we add this script and instructions to src/tools/pginclude?

Probably not, but maybe the developer FAQ in the wiki?

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [COMMITTERS] pgsql: Clean up the #include mess a little.

From
Bruce Momjian
Date:
On Thu, Aug 16, 2012 at 11:15:24AM -0400, Alvaro Herrera wrote:
> Excerpts from Bruce Momjian's message of mié ago 15 18:30:40 -0400 2012:
> > 
> > On Wed, Sep  7, 2011 at 01:20:17AM +0300, Ants Aasma wrote:
> > > On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera
> > > <alvherre@commandprompt.com> wrote:
> > > > I wonder what happens if files in the same subdir are grouped in a
> > > > subgraph.  Is that possible?
> > > 
> > > Possible, and done. Also added possivility to add .c files to the graph,
> > > coloring by subdir and possibility exclude nodes from the graph. I didn't yet
> > > bother to clean up the code - to avoid eye damage, don't look at the source.
> > > 
> > > Bad news is that it doesn't significantly help readability for the all nodes
> > > case. See all_with_subgraphs.svgz.  It does help for other cases.
> > > For example parsenodes.h.svgz has the result for
> > > render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs
> > > and execnodes.h.svgz for
> > > --subgraphs --select='nodes/execnodes.h+*-*'
> > 
> > Should we add this script and instructions to src/tools/pginclude?
> 
> Probably not, but maybe the developer FAQ in the wiki?

I just added the script email URL to the pginclude README.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +