Thread: Breaking compile-time dependency cycles of Postgres subdirs?

Breaking compile-time dependency cycles of Postgres subdirs?

From: Christian Convey
This question is mostly just curiosity...

There are build-time dependency cycles between some of Postgres' code subdirectories.  For example, "storage" and "access" have such a cycle:
storage/bufpage.h #includes access/xlogdefs.h
access/visibilitymap.h #includes storage/block.h

Has there been any discussion about reorganizing these directories so that no such cycles exist?

As someone very new to this code base, I think these cycles make it a little harder to figure out the runtime and compile-time dependencies between the subsystems these directories seem to represent.  I wonder if that's a problem others face as well?
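
A minimal sketch of how directory-level include cycles like the one above could be found mechanically. It assumes headers are referenced with quoted, directory-prefixed includes, and it works on an in-memory mapping of path to file text. The function names `directory_edges` and `mutual_pairs` and the toy header contents are hypothetical, not part of any Postgres tooling.

```python
import re
from collections import defaultdict

# Matches quoted includes such as: #include "access/xlogdefs.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def directory_edges(headers):
    """Map each directory to the set of other directories its headers include.

    `headers` maps a path like "storage/bufpage.h" to that file's text.
    """
    edges = defaultdict(set)
    for path, text in headers.items():
        src_dir = path.split("/")[0]
        for target in INCLUDE_RE.findall(text):
            if "/" not in target:
                continue  # skip headers with no directory prefix
            dst_dir = target.split("/")[0]
            if dst_dir != src_dir:
                edges[src_dir].add(dst_dir)
    return edges


def mutual_pairs(edges):
    """Return sorted pairs of directories that include each other directly."""
    pairs = set()
    for src, dsts in edges.items():
        for dst in dsts:
            if src in edges.get(dst, ()):
                pairs.add(tuple(sorted((src, dst))))
    return sorted(pairs)


# Toy stand-ins for the two headers in the example (not the real file contents).
headers = {
    "storage/bufpage.h": '#include "access/xlogdefs.h"\n',
    "access/visibilitymap.h": '#include "storage/block.h"\n',
}

print(mutual_pairs(directory_edges(headers)))
```

Note that neither file includes the other here; the cycle only appears once includes are aggregated per directory, which is the level this question is about.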

Re: Breaking compile-time dependency cycles of Postgres subdirs?

From: Robert Haas
On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey
<christian.convey@gmail.com> wrote:
> This question is mostly just curiosity...
>
> There are build-time dependency cycles between some of Postgres' code
> subdirectories.  For example, "storage" and "access" have such a cycle:
> storage/bufpage.h #includes access/xlogdefs.h
> access/visibilitymap.h #includes storage/block.h
>
> Has there been any discussion about reorganizing these directories so that
> no such cycles exist?

Not to my knowledge.

> As someone very new to this code base, I think these cycles make it a little
> harder to figure out the runtime and compile-time dependencies between the
> subsystems these directories seem to represent.  I wonder if that's a
> problem others face as well?

There are probably some cases that could be improved, but I have my
doubts about whether eliminating cycles is a reasonable goal.
Sometimes, two modules really do depend on each other.  And, you're
talking about this not just on the level of individual files but
entire subtrees.  There are 90,000 lines of code in src/backend/access
(whose headers are in src/include/access) and more than 38,000 in
src/backend/storage (whose headers are in src/include/storage);
expecting all dependencies between those modules to go in one
direction doesn't feel terribly reasonable.  If it could be done at
all, you'd probably end up separating code into lots of little tiny
directories, splitting apart modules with logically related
functionality into chunks living in entirely different parts of the
source tree - and I don't think that would be an improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Breaking compile-time dependency cycles of Postgres subdirs?

From: Christian Convey
On Sun, Feb 9, 2014 at 8:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey
> <christian.convey@gmail.com> wrote:
>> This question is mostly just curiosity...
>> As someone very new to this code base, I think these cycles make it a little
>> harder to figure out the runtime and compile-time dependencies between the
>> subsystems these directories seem to represent.  I wonder if that's a
>> problem others face as well?

> There are probably some cases that could be improved, but I have my
> doubts about whether eliminating cycles is a reasonable goal.
> Sometimes, two modules really do depend on each other.  And, you're
> talking about this not just on the level of individual files but
> entire subtrees.  There are 90,000 lines of code in src/backend/access
> (whose headers are in src/include/access) and more than 38,000 in
> src/backend/storage (whose headers are in src/include/storage);
> expecting all dependencies between those modules to go in one
> direction doesn't feel terribly reasonable.  If it could be done at
> all, you'd probably end up separating code into lots of little tiny
> directories, splitting apart modules with logically related
> functionality into chunks living in entirely different parts of the
> source tree - and I don't think that would be an improvement.


Thanks Robert.  IMHO, whether or not it would be beneficial depends on which files (or definitions within files) had to be broken out into additional subdirectories in order to break the cycles.   If it could be accomplished with at most a few additional subdirectories that were also intuitively meaningful groupings of files/definitions, it could be a win.  But if not, I agree it would be a step backwards.

Still, I'm thinking this might be a problem we need to partially solve if we're going to support a pluggable storage manager, particularly if we allow a pluggable storage manager to use the system's buffer system and/or block I/O system.  I guess it depends on exactly what we want from a pluggable storage manager.

- Christian

Re: Breaking compile-time dependency cycles of Postgres subdirs?

From: Tom Lane
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey
> <christian.convey@gmail.com> wrote:
>> As someone very new to this code base, I think these cycles make it a little
>> harder to figure out the runtime and compile-time dependencies between the
>> subsystems these directories seem to represent.  I wonder if that's a
>> problem others face as well?

> There are probably some cases that could be improved, but I have my
> doubts about whether eliminating cycles is a reasonable goal.

Aside from Robert's points, I have a couple of thoughts:

I think if it had been a clear, enforced goal all along, it might've been
possible to build the system with such a restriction (for the most part at
least).  At this point though, the amount of work and code churn involved
seems like it'd far exceed the benefits.

It's also fair to question how much improvement in comprehensibility
we'd really get.  It's not like code's been dropped into completely
random places where it doesn't belong.  In the end, Postgres is a pretty
big system and it's necessarily going to take time for newbies to learn
their way around it.

I believe there are some cases where circularity is just about
unavoidable.  As an example, the error reporting code in elog.c depends
on memory management in mcxt.c, which itself uses elog.c's reporting
facilities.  There's another mutual dependency between error reporting
and GUC (server configuration control).  And on and on.  I think the
coding rule you're suggesting would require that each such dependency
loop be confined to one major backend subsystem, which seems rather
arbitrary.
        regards, tom lane
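
A rough sketch of what checking the rule described above might look like: group files into dependency loops, then flag any loop whose members live in more than one subsystem. The edges below are hypothetical stand-ins for the elog.c / mcxt.c / GUC example, and treating the first path component as the "major subsystem" is an assumption made for illustration only.

```python
def reachable(graph, start):
    """Return every node reachable from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def dependency_loops(graph):
    """Group files that depend on each other, directly or transitively."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    reach = {n: reachable(graph, n) for n in nodes}
    loops = set()
    for a in nodes:
        # b belongs to a's loop when each can reach the other
        group = frozenset(b for b in reach[a] if a in reach[b])
        if group:
            loops.add(group)
    return loops


def subsystem(path):
    """Assumption: the first path component names the major subsystem."""
    return path.split("/")[0]


def loops_spanning_subsystems(graph):
    """Return the loops whose files sit in more than one subsystem."""
    return [
        sorted(loop)
        for loop in dependency_loops(graph)
        if len({subsystem(p) for p in loop}) > 1
    ]


# Hypothetical edges modelled on the mutual dependencies mentioned above.
graph = {
    "utils/error/elog.c": {"utils/mmgr/mcxt.c", "utils/misc/guc.c"},
    "utils/mmgr/mcxt.c": {"utils/error/elog.c"},
    "utils/misc/guc.c": {"utils/error/elog.c"},
}

print(loops_spanning_subsystems(graph))
```

For this toy graph the result is an empty list, because all three files fall under the same top-level directory. The pairwise reachability approach is quadratic and only meant for small graphs; a real tool would more likely use a linear-time strongly-connected-components algorithm.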



Re: Breaking compile-time dependency cycles of Postgres subdirs?

From: Christian Convey
On Mon, Feb 10, 2014 at 10:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think if it had been a clear, enforced goal all along, it might've been
> possible to build the system with such a restriction (for the most part at
> least).  At this point though, the amount of work and code churn involved
> seems like it'd far exceed the benefits.


That makes sense to me.  I certainly didn't think it was a slam-dunk that what I was proposing would be an improvement.  It just seemed like a question worth asking.  Thanks for your thoughts.

- Christian