Thread: Re: [COMMITTERS] pgsql: Clean up the #include mess a little.
Tom Lane wrote: > Clean up the #include mess a little. > > walsender.h should depend on xlog.h, not vice versa. (Actually, the > inclusion was circular until a couple hours ago, which was even sillier; > but Bruce broke it in the expedient rather than logically correct > direction.) Because of that poor decision, plus blind application of > pgrminclude, we had a situation where half the system was depending on > xlog.h to include such unrelated stuff as array.h and guc.h. Clean up > the header inclusion, and manually revert a lot of what pgrminclude had > done so things build again. > > This episode reinforces my feeling that pgrminclude should not be run > without adult supervision. Inclusion changes in header files in particular > need to be reviewed with great care. More generally, it'd be good if we > had a clearer notion of module layering to dictate which headers can sanely > include which others ... but that's a big task for another day. What pgrminclude does it to lock down the minimal set of includes, and that easily could cause something like xlog.h becoming the go-to include file for many C files. I don't remember this problem happening before but it clearly happened this time. Not sure how to avoid that except, as you said, analyze the entire changeset of pgrminclude. For this run, I focused on not breaking any platform builds so I was not focusing on the actual include file layout. I assumed fiddling with the actual pgrminclude output would likely break builds. The process I used was to get pgcompinclude to allow all include files to compile (so they their inclusion would not bleed into files that included them), then run pgrminclude. Well, I assume we are done for another five years. The includes removed were minimal, especially considering five years of work. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, Sep 5, 2011 at 2:52 PM, Bruce Momjian <bruce@momjian.us> wrote: > Well, I assume we are done for another five years. The includes removed > were minimal, especially considering five years of work. > What I wouldn't mind seeing is a graph of all includes and what they include. This might help figure out what layering violations there are like the one that caused this mess. I think I've seen tools to do this already somewhere. -- greg
On Mon, Sep 5, 2011 at 15:55, Greg Stark <stark@mit.edu> wrote: > On Mon, Sep 5, 2011 at 2:52 PM, Bruce Momjian <bruce@momjian.us> wrote: >> Well, I assume we are done for another five years. The includes removed >> were minimal, especially considering five years of work. >> > > What I wouldn't mind seeing is a graph of all includes and what they > include. This might help figure out what layering violations there are > like the one that caused this mess. I think I've seen tools to do this > already somewhere. http://doxygen.postgresql.org will do some of that, but I think not globally - but if you click into one header, I think it shows you the map from that perspective. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Excerpts from Magnus Hagander's message of lun sep 05 11:02:23 -0300 2011: > On Mon, Sep 5, 2011 at 15:55, Greg Stark <stark@mit.edu> wrote: > > On Mon, Sep 5, 2011 at 2:52 PM, Bruce Momjian <bruce@momjian.us> wrote: > >> Well, I assume we are done for another five years. The includes removed > >> were minimal, especially considering five years of work. > > > > What I wouldn't mind seeing is a graph of all includes and what they > > include. This might help figure out what layering violations there are > > like the one that caused this mess. I think I've seen tools to do this > > already somewhere. > > http://doxygen.postgresql.org will do some of that, but I think not > globally - but if you click into one header, I think it shows you the > map from that perspective. Yeah; and it isn't always complete, because some graphs tend to get too unwieldy so it has to prune (you can see this because some nodes show up with red borders). I am not sure it is really feasible to build a complete graph for all headers. We have too many of them and too many dependencies. Another useful graph to see is what files include a given header. A funny thing is that doxygen doesn't always display this; for example http://doxygen.postgresql.org/rel_8h.html -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > I am not sure it is really feasible to build a complete graph for all > headers. We have too many of them and too many dependencies. Yeah, it's the "too many dependencies" aspect that is bothering me. The only concrete idea I've come up with so far is that it'd be a good idea to isolate certain primitive datatypes into their own group of headers. We have a number of headers that are meant to be this sort of animal already, eg storage/block.h, storage/relfilenode.h. But (1) there's no clear distinction between these headers and ones like, say, storage/smgr.h or storage/proc.h. (2) other things that have become widely-used primitive datatypes, such as TimestampTz, are declared in places that ideally ought to be near the top of the #include hierarchy not the bottom. So I think we could make some forward progress by moving all these simple datatype declarations into a separate set of headers in their own subdirectory of src/include/, perhaps "datatype". There would be a hard and fast rule that no header in this set could depend on anything beyond postgres.h and other members of the same set, so that these headers clearly form the bottom level of the #include hierarchy. Probably some of the stuff now in postgres.h could migrate to this group too. Eventually I'd like to see some fairly clear layering rules at the header-directory level, like "storage/ is lower-level than commands/ so nothing in the former directory should include anything in the latter". But achieving that is a long way off. Of course, the problem with all of this is that making much progress would be a large amount of work with relatively small concrete payoff. Still, I'm starting to feel that we've got such a spaghetti-like mess that we need to do something. regards, tom lane
On Mon, Sep 5, 2011 at 4:55 PM, Greg Stark <stark@mit.edu> wrote: > What I wouldn't mind seeing is a graph of all includes and what they > include. This might help figure out what layering violations there are > like the one that caused this mess. I think I've seen tools to do this > already somewhere. I whipped together a quick Python script to do this. Attached is the Python script (requires pydot) and the result of running it on includes/. I didn't attach the png version of the output because it was 7MB. If rendering all includes at once doesn't give a good overview it can also select a subset through traversing dependencies. For example: render_includes.py -i include/ \ --select="storage/spin.h+*,access/xlog.h+*" output.png This will render everything that directly or indirectly depends on those two headers. See --help for details. -- Ants Aasma
Attachment
Excerpts from Ants Aasma's message of mar sep 06 12:40:04 -0300 2011: > On Mon, Sep 5, 2011 at 4:55 PM, Greg Stark <stark@mit.edu> wrote: > > What I wouldn't mind seeing is a graph of all includes and what they > > include. This might help figure out what layering violations there are > > like the one that caused this mess. I think I've seen tools to do this > > already somewhere. > > I whipped together a quick Python script to do this. Attached is the > Python script (requires pydot) and the result of running it on includes/. > I didn't attach the png version of the output because it was 7MB. > > If rendering all includes at once doesn't give a good overview it can > also select a subset through traversing dependencies. For example: > render_includes.py -i include/ \ > --select="storage/spin.h+*,access/xlog.h+*" output.png > > This will render everything that directly or indirectly depends on those > two headers. See --help for details. Wow, interesting, thanks. What this says to me is that we should do something about execnodes.h and some other nodes file (parsenodes.h I think). I wonder what happens if files in the same subdir are grouped in a subgraph. Is that possible? -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > I wonder what happens if files in the same subdir are grouped in a > subgraph. Is that possible? Possible, and done. Also added possivility to add .c files to the graph, coloring by subdir and possibility exclude nodes from the graph. I didn't yet bother to clean up the code - to avoid eye damage, don't look at the source. Bad news is that it doesn't significantly help readability for the all nodes case. See all_with_subgraphs.svgz. It does help for other cases. For example parsenodes.h.svgz has the result for render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs and execnodes.h.svgz for --subgraphs --select='nodes/execnodes.h+*-*' -- Ants Aasma
Attachment
On Wed, Sep 7, 2011 at 01:20:17AM +0300, Ants Aasma wrote: > On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera > <alvherre@commandprompt.com> wrote: > > I wonder what happens if files in the same subdir are grouped in a > > subgraph. Is that possible? > > Possible, and done. Also added possivility to add .c files to the graph, > coloring by subdir and possibility exclude nodes from the graph. I didn't yet > bother to clean up the code - to avoid eye damage, don't look at the source. > > Bad news is that it doesn't significantly help readability for the all nodes > case. See all_with_subgraphs.svgz. It does help for other cases. > For example parsenodes.h.svgz has the result for > render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs > and execnodes.h.svgz for > --subgraphs --select='nodes/execnodes.h+*-*' Should we add this script and instructions to src/tools/pginclude? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Excerpts from Bruce Momjian's message of mié ago 15 18:30:40 -0400 2012: > > On Wed, Sep 7, 2011 at 01:20:17AM +0300, Ants Aasma wrote: > > On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera > > <alvherre@commandprompt.com> wrote: > > > I wonder what happens if files in the same subdir are grouped in a > > > subgraph. Is that possible? > > > > Possible, and done. Also added possivility to add .c files to the graph, > > coloring by subdir and possibility exclude nodes from the graph. I didn't yet > > bother to clean up the code - to avoid eye damage, don't look at the source. > > > > Bad news is that it doesn't significantly help readability for the all nodes > > case. See all_with_subgraphs.svgz. It does help for other cases. > > For example parsenodes.h.svgz has the result for > > render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs > > and execnodes.h.svgz for > > --subgraphs --select='nodes/execnodes.h+*-*' > > Should we add this script and instructions to src/tools/pginclude? Probably not, but maybe the developer FAQ in the wiki? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Aug 16, 2012 at 11:15:24AM -0400, Alvaro Herrera wrote: > Excerpts from Bruce Momjian's message of mié ago 15 18:30:40 -0400 2012: > > > > On Wed, Sep 7, 2011 at 01:20:17AM +0300, Ants Aasma wrote: > > > On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera > > > <alvherre@commandprompt.com> wrote: > > > > I wonder what happens if files in the same subdir are grouped in a > > > > subgraph. Is that possible? > > > > > > Possible, and done. Also added possivility to add .c files to the graph, > > > coloring by subdir and possibility exclude nodes from the graph. I didn't yet > > > bother to clean up the code - to avoid eye damage, don't look at the source. > > > > > > Bad news is that it doesn't significantly help readability for the all nodes > > > case. See all_with_subgraphs.svgz. It does help for other cases. > > > For example parsenodes.h.svgz has the result for > > > render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs > > > and execnodes.h.svgz for > > > --subgraphs --select='nodes/execnodes.h+*-*' > > > > Should we add this script and instructions to src/tools/pginclude? > > Probably not, but maybe the developer FAQ in the wiki? I just added the script email URL to the pginclude README. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +