Thread: Breaking compile-time dependency cycles of Postgres subdirs?
This question is mostly just curiosity...
There are build-time dependency cycles between some of Postgres' code subdirectories. For example, "storage" and "access" have such a cycle:
storage/bufpage.h #includes access/xlogdefs.h
access/visibilitymap.h #includes storage/block.h
Has there been any discussion about reorganizing these directories so that no such cycles exist?
As someone very new to this code base, I think these cycles make it a little harder to figure out the runtime and compile-time dependencies between the subsystems these directories seem to represent. I wonder if that's a problem others face as well?
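For anyone who wants to look for this kind of cycle mechanically, here is a minimal sketch, not an existing PostgreSQL tool. It assumes header contents are supplied as a dict keyed by path relative to src/include; the names `directory_edges` and `mutual_pairs` are made up for illustration. It collapses file-level #include lines to directory-level edges and reports directories that include each other.

```python
import re
from collections import defaultdict

# Matches quoted includes such as: #include "access/xlogdefs.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def directory_edges(headers):
    """Map each directory to the set of other directories its headers include.

    `headers` maps a path such as "storage/bufpage.h" to that file's text.
    """
    edges = defaultdict(set)
    for path, text in headers.items():
        src_dir = path.split("/")[0]
        for included in INCLUDE_RE.findall(text):
            if "/" not in included:
                continue  # top-level header such as "postgres.h"; no subdirectory
            dst_dir = included.split("/")[0]
            if dst_dir != src_dir:
                edges[src_dir].add(dst_dir)
    return edges


def mutual_pairs(edges):
    """Return sorted (a, b) pairs, a < b, where a includes b and b includes a."""
    pairs = set()
    for a, targets in edges.items():
        for b in targets:
            if a in edges.get(b, ()):
                pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)


# Toy input: only the two includes named above, not the real header contents.
sample = {
    "storage/bufpage.h": '#include "access/xlogdefs.h"\n',
    "access/visibilitymap.h": '#include "storage/block.h"\n',
}
print(mutual_pairs(directory_edges(sample)))  # [('access', 'storage')]
```

This only catches two-directory loops; longer loops (a -> b -> c -> a) would need a proper graph search.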
On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey <christian.convey@gmail.com> wrote:
> This question is mostly just curiosity...
>
> There are build-time dependency cycles between some of Postgres' code
> subdirectories. For example, "storage" and "access" have such a cycle:
> storage/bufpage.h #includes access/xlogdefs.h
> access/visibilitymap.h #includes storage/block.h
>
> Has there been any discussion about reorganizing these directories so that
> no such cycles exist?

Not to my knowledge.

> As someone very new to this code base, I think these cycles make it a little
> harder to figure out the runtime and compile-time dependencies between the
> subsystems these directories seem to represent. I wonder if that's a
> problem others face as well?

There are probably some cases that could be improved, but I have my
doubts about whether eliminating cycles is a reasonable goal.
Sometimes, two modules really do depend on each other. And, you're
talking about this not just on the level of individual files but
entire subtrees. There are 90,000 lines of code in src/backend/access
(whose headers are in src/include/access) and more than 38,000 in
src/backend/storage (whose headers are in src/include/storage);
expecting all dependencies between those modules to go in one
direction doesn't feel terribly reasonable. If it could be done at
all, you'd probably end up separating code into lots of little tiny
directories, splitting apart modules with logically related
functionality into chunks living in entirely different parts of the
source tree - and I don't think that would be an improvement.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Feb 9, 2014 at 8:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey
> <christian.convey@gmail.com> wrote:
>> This question is mostly just curiosity...
>> As someone very new to this code base, I think these cycles make it a little
>> harder to figure out the runtime and compile-time dependencies between the
>> subsystems these directories seem to represent. I wonder if that's a
>> problem others face as well?
>
> There are probably some cases that could be improved, but I have my
> doubts about whether eliminating cycles is a reasonable goal.
> Sometimes, two modules really do depend on each other. And, you're
> talking about this not just on the level of individual files but
> entire subtrees. There are 90,000 lines of code in src/backend/access
> (whose headers are in src/include/access) and more than 38,000 in
> src/backend/storage (whose headers are in src/include/storage);
> expecting all dependencies between those modules to go in one
> direction doesn't feel terribly reasonable. If it could be done at
> all, you'd probably end up separating code into lots of little tiny
> directories, splitting apart modules with logically related
> functionality into chunks living in entirely different parts of the
> source tree - and I don't think that would be an improvement.

Thanks Robert. IMHO, whether or not it would be beneficial depends on which files (or definitions within files) had to be broken out into additional subdirectories in order to break the cycles. If it could be accomplished with at most a few additional subdirectories that were also intuitively meaningful groupings of files/definitions, it could be a win. But if not, I agree it would be a step backwards.
Still, I'm thinking this might be a problem we need to partially solve if we're going to support a pluggable storage manager, particularly if we allow a pluggable storage manager to use the system's buffer system and/or block I/O system. I guess it depends on exactly what we want from a pluggable storage manager.
- Christian
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey
> <christian.convey@gmail.com> wrote:
>> As someone very new to this code base, I think these cycles make it a little
>> harder to figure out the runtime and compile-time dependencies between the
>> subsystems these directories seem to represent. I wonder if that's a
>> problem others face as well?

> There are probably some cases that could be improved, but I have my
> doubts about whether eliminating cycles is a reasonable goal.

Aside from Robert's points, I have a couple of thoughts:

I think if it had been a clear, enforced goal all along, it might've been
possible to build the system with such a restriction (for the most part at
least). At this point though, the amount of work and code churn involved
seems like it'd far exceed the benefits.

It's also fair to question how much improvement in comprehensibility we'd
really get. It's not like code's been dropped into completely random
places where it doesn't belong. In the end, Postgres is a pretty big
system and it's necessarily going to take time for newbies to learn their
way around it.

I believe there are some cases where circularity is just about
unavoidable. As an example, the error reporting code in elog.c depends on
memory management in mcxt.c, which itself uses elog.c's reporting
facilities. There's another mutual dependency between error reporting and
GUC (server configuration control). And on and on. I think the coding
rule you're suggesting would require that each such dependency loop be
confined to one major backend subsystem, which seems rather arbitrary.

regards, tom lane
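To illustrate what "each dependency loop confined to one subsystem" would mean in practice, here is a rough sketch that groups modules which can all reach one another through dependencies. The graph is hand-written: the elog/mcxt/guc edges come from the examples in the message above, while the `bufmgr` edge is an invented one-way dependency added purely for contrast. The function names are hypothetical as well.

```python
def reachable(graph, start):
    """Return every node reachable from `start` by following edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def dependency_loops(graph):
    """Group nodes that can all reach one another (groups of size > 1 only)."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    reach = {n: reachable(graph, n) for n in nodes}
    groups = []
    assigned = set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        # n and m belong together when each can reach the other.
        group = {n} | {m for m in reach[n] if n in reach[m]}
        assigned |= group
        if len(group) > 1:
            groups.append(sorted(group))
    return groups


# Hypothetical graph: elog/mcxt/guc edges follow the message above;
# "bufmgr" -> "elog" is an invented one-way edge for contrast.
deps = {
    "elog": {"mcxt", "guc"},
    "mcxt": {"elog"},
    "guc": {"elog"},
    "bufmgr": {"elog"},
}
print(dependency_loops(deps))  # [['elog', 'guc', 'mcxt']]
```

Under the rule being discussed, everything in one such group would have to live in the same subsystem. Two loops that share a member (here, elog) merge into a single group, so groups can grow quickly.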
On Mon, Feb 10, 2014 at 10:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think if it had been a clear, enforced goal all along, it might've been
> possible to build the system with such a restriction (for the most part at
> least). At this point though, the amount of work and code churn involved
> seems like it'd far exceed the benefits.

That makes sense to me. I certainly didn't think it was a slam-dunk that what I was proposing would be an improvement. It just seemed like a question worth asking. Thanks for your thoughts.
- Christian