Breaking compile-time dependency cycles of Postgres subdirs?

Started by Christian Convey almost 12 years ago, 5 messages
#1 Christian Convey
christian.convey@gmail.com

This question is mostly just curiosity...

There are build-time dependency cycles between some of Postgres' code
subdirectories. For example, "storage" and "access" have such a cycle:
storage/bufpage.h #includes access/xlogdefs.h
access/visibilitymap.h #includes storage/block.h

Has there been any discussion about reorganizing these directories so that
no such cycles exist?

As someone very new to this code base, I think these cycles make it a
little harder to figure out the runtime and compile-time dependencies
between the subsystems these directories seem to represent. I wonder if
that's a problem others face as well?
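
The directory-level cycle described above can be checked mechanically. What follows is only a minimal sketch in Python: the `INCLUDES` table holds just the two #include relationships quoted in this message rather than a scan of the real tree, and the helper names `directory_edges` and `find_cycle` are made up for illustration.

```python
from collections import defaultdict

# Only the two includes quoted in the message; a real check would have to
# parse every header under src/include (not attempted here).
INCLUDES = {
    "storage/bufpage.h": ["access/xlogdefs.h"],
    "access/visibilitymap.h": ["storage/block.h"],
}


def directory_edges(includes):
    """Collapse header-to-header includes into directory-to-directory edges."""
    edges = defaultdict(set)
    for header, targets in includes.items():
        src = header.split("/")[0]
        for target in targets:
            dst = target.split("/")[0]
            if src != dst:  # includes within one directory are not of interest
                edges[src].add(dst)
    return edges


def find_cycle(edges):
    """Return one cycle as a list of directories, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every node starts out WHITE (unvisited)
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in sorted(edges.get(node, ())):
            if color[nxt] == GRAY:
                # nxt is already on the current path, so the path loops back
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in sorted(edges):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    print(find_cycle(directory_edges(INCLUDES)))
```

With only these two includes as input, the depth-first search reports the loop between "access" and "storage". It says nothing about how many other such loops exist in the tree.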

#2 Robert Haas
robertmhaas@gmail.com
In reply to: Christian Convey (#1)
Re: Breaking compile-time dependency cycles of Postgres subdirs?

On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey
<christian.convey@gmail.com> wrote:

This question is mostly just curiosity...

There are build-time dependency cycles between some of Postgres' code
subdirectories. For example, "storage" and "access" have such a cycle:
storage/bufpage.h #includes access/xlogdefs.h
access/visibilitymap.h #includes storage/block.h

Has there been any discussion about reorganizing these directories so that
no such cycles exist?

Not to my knowledge.

As someone very new to this code base, I think these cycles make it a little
harder to figure out the runtime and compile-time dependencies between the
subsystems these directories seem to represent. I wonder if that's a
problem others face as well?

There are probably some cases that could be improved, but I have my
doubts about whether eliminating cycles is a reasonable goal.
Sometimes, two modules really do depend on each other. And, you're
talking about this not just on the level of individual files but
entire subtrees. There are 90,000 lines of code in src/backend/access
(whose headers are in src/include/access) and more than 38,000 in
src/backend/storage (whose headers are in src/include/storage);
expecting all dependencies between those modules to go in one
direction doesn't feel terribly reasonable. If it could be done at
all, you'd probably end up separating code into lots of little tiny
directories, splitting apart modules with logically related
functionality into chunks living in entirely different parts of the
source tree - and I don't think that would be an improvement.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Christian Convey
christian.convey@gmail.com
In reply to: Robert Haas (#2)
Re: Breaking compile-time dependency cycles of Postgres subdirs?

On Sun, Feb 9, 2014 at 8:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey
<christian.convey@gmail.com> wrote:

This question is mostly just curiosity...

As someone very new to this code base, I think these cycles make it a little
harder to figure out the runtime and compile-time dependencies between the
subsystems these directories seem to represent. I wonder if that's a
problem others face as well?

There are probably some cases that could be improved, but I have my
doubts about whether eliminating cycles is a reasonable goal.
Sometimes, two modules really do depend on each other. And, you're
talking about this not just on the level of individual files but
entire subtrees. There are 90,000 lines of code in src/backend/access
(whose headers are in src/include/access) and more than 38,000 in
src/backend/storage (whose headers are in src/include/storage);
expecting all dependencies between those modules to go in one
direction doesn't feel terribly reasonable. If it could be done at
all, you'd probably end up separating code into lots of little tiny
directories, splitting apart modules with logically related
functionality into chunks living in entirely different parts of the
source tree - and I don't think that would be an improvement.

Thanks Robert. IMHO, whether or not it would be beneficial depends on
which files (or definitions within files) had to be broken out into
additional subdirectories in order to break the cycles. If it could be
accomplished with at most a few additional subdirectories that were also
intuitively meaningful groupings of files/definitions, it could be a win.
But if not, I agree it would be a step backwards.

Still, I'm thinking this might be a problem we need to partially solve if
we're going to support a pluggable storage manager, particularly if we
allow a pluggable storage manager to use the system's buffer system and/or
block I/O system. I guess it depends on exactly what we want from a
pluggable storage manager.

- Christian

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#2)
Re: Breaking compile-time dependency cycles of Postgres subdirs?

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey
<christian.convey@gmail.com> wrote:

As someone very new to this code base, I think these cycles make it a little
harder to figure out the runtime and compile-time dependencies between the
subsystems these directories seem to represent. I wonder if that's a
problem others face as well?

There are probably some cases that could be improved, but I have my
doubts about whether eliminating cycles is a reasonable goal.

Aside from Robert's points, I have a couple of thoughts:

I think if it had been a clear, enforced goal all along, it might've been
possible to build the system with such a restriction (for the most part at
least). At this point though, the amount of work and code churn involved
seems like it'd far exceed the benefits.

It's also fair to question how much improvement in comprehensibility
we'd really get. It's not like code's been dropped into completely
random places where it doesn't belong. In the end, Postgres is a pretty
big system and it's necessarily going to take time for newbies to learn
their way around it.

I believe there are some cases where circularity is just about
unavoidable. As an example, the error reporting code in elog.c depends
on memory management in mcxt.c, which itself uses elog.c's reporting
facilities. There's another mutual dependency between error reporting
and GUC (server configuration control). And on and on. I think the
coding rule you're suggesting would require that each such dependency
loop be confined to one major backend subsystem, which seems rather
arbitrary.
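
One way to read the rule described above: every file-level dependency loop (a strongly connected component of the dependency graph) would have to sit inside a single subsystem. The Python sketch below is only an illustration under stated assumptions. The `DEPENDS` table encodes just the two mutual dependencies named in this message, the file paths are assumed for the example, and the helper names are hypothetical.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; returns a list of sets of nodes."""
    index, low = {}, {}
    stack, on_stack = [], set()
    result = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            result.append(comp)

    nodes = set(graph) | {w for ws in graph.values() for w in ws}
    for v in sorted(nodes):
        if v not in index:
            visit(v)
    return result


def loops_spanning_subsystems(graph, subsystem_of):
    """Return dependency loops whose files fall in more than one subsystem."""
    offenders = []
    for comp in strongly_connected_components(graph):
        if len(comp) > 1 and len({subsystem_of(f) for f in comp}) > 1:
            offenders.append(sorted(comp))
    return offenders


# Assumed paths; the edges are only the mutual dependencies named above
# (error reporting <-> memory management, error reporting <-> GUC).
DEPENDS = {
    "utils/error/elog.c": ["utils/mmgr/mcxt.c", "utils/misc/guc.c"],
    "utils/mmgr/mcxt.c": ["utils/error/elog.c"],
    "utils/misc/guc.c": ["utils/error/elog.c"],
}


def top_level(path):
    """Treat the first path component as the subsystem."""
    return path.split("/")[0]


def leaf_dir(path):
    """Treat the containing directory as the subsystem."""
    return path.rsplit("/", 1)[0]


if __name__ == "__main__":
    print(loops_spanning_subsystems(DEPENDS, top_level))
    print(loops_spanning_subsystems(DEPENDS, leaf_dir))
```

Under these assumptions the same loop passes the check when a subsystem means the top-level directory and fails it when a subsystem means the containing directory. So whether the rule is satisfied depends on where the subsystem boundaries are drawn.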

regards, tom lane


#5 Christian Convey
christian.convey@gmail.com
In reply to: Tom Lane (#4)
Re: Breaking compile-time dependency cycles of Postgres subdirs?

On Mon, Feb 10, 2014 at 10:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think if it had been a clear, enforced goal all along, it might've been
possible to build the system with such a restriction (for the most part at
least). At this point though, the amount of work and code churn involved
seems like it'd far exceed the benefits.

That makes sense to me. I certainly didn't think it was a slam-dunk that
what I was proposing would be an improvement. It just seemed like a
question worth asking. Thanks for your thoughts.

- Christian