# Abstract

Brooks’ Law states: “Adding manpower to a late software project makes it later.” This Law is applicable to any task involving lots of people in complex interaction. The only evidence Brooks provides is anecdotal: “Since software construction is complex, the communications overhead is great.” Furthermore, his graph illustrating the perverse “bathtub” relationship between men and months has axes with no numbers on them. No one seriously doubts the general validity of Brooks’ strange graph, but there is something disturbing about a Law that lacks quantification and has no coherent theory to explain why it must be so. This Knol provides a scientific underpinning, derived from Information and Hierarchy Theory, that allows quantification of Brooks’ Law. Given this theoretical support, it may be possible to determine when adding more software engineers to a project is likely to delay completion rather than speed it.

## Abstract

Brooks’ Law states:

“Adding manpower to a late software project makes it later.”

This Law is applicable to any task involving lots of people in complex interaction, not just software engineering.

This famous remark by Frederick P. Brooks, Jr. in his classic book *The Mythical Man-Month* [Brooks, 1975] is, unfortunately, lacking in scientific rigor. The only evidence he provides is anecdotal:

Since software construction is complex, the communications overhead is great.

Furthermore, his graph (see his “Fig. 3.” above) illustrating the perverse “bathtub” relationship between “men” [1] and months has axes with no numbers on them.

Nevertheless, this bit of wisdom has stood the test of time for over three decades. It has done so in the face of software’s new status as an engineering discipline. Despite the advent of higher-level languages, integrated programming environments, object-oriented approaches, and capability maturity models, no one with experience in software engineering for complex systems seriously doubts the general validity of Brooks’ strange graph.

However, there is something disturbing about a *Law* that lacks quantification and has no coherent theory to explain why it must be so. This paper provides a scientific underpinning, derived from *Information* and *Hierarchy Theory*, that allows quantification of Brooks’ Law. Given this theoretical support, it may be possible to determine when adding more software engineers to a project is likely to delay completion rather than speed it.

## Introduction

According to Frederick P. Brooks, Jr. [Dorfman, 1997, p351]:

… *the man-month as a unit for measuring the size of a job is a dangerous and deceptive myth* … Men and months are interchangeable commodities only when a task can be partitioned among many workers with *no communication among them.* [*Italics* in original]

Brooks provides a graph to illustrate the above point (see his “Fig. 1.” at right). Note that there are no numbers on the axes for “Men” and “Months”. He states his rationale as follows:

The term “man-month” implies that if one man takes 10 months to do a job, 10 men can do it in one month. This may be true of picking cotton.

The best that can be done when worker training and intercommunication are required is a trade-off between men and months with diminishing returns (see Brooks’ “Fig. 2.” at right). He states his rationale as follows:

Even on tasks that can be nicely partitioned among people, the additional communication required adds to the total work, increasing the schedule.

For the case of software development, the situation may be even worse than indicated in the figure to the right. The inherent complexity of software, and the need for continual intercommunication between workers, may lead to the situation where adding more workers may, according to Brooks’ Law, actually lengthen the schedule!

What is the scientific basis for the perverse “bathtub” curve (see Brooks’ “Fig. 3.” above in the Abstract section)?

## Quantifying Brooks’ Curves

How may we add scientific rigor, quantification, and a coherent theoretical underpinning to *Brooks’ Law*? What mathematical equations may be used to generate Brooks’ strange “bathtub” curve? This section quantifies Brooks’ three curves.

From here on, Brooks’ obsolete terminology will be updated, using “people” instead of “men” in recognition of the fact that software engineers are at least as likely to be women as men.

The figure below shows the results of analysis based on combining the *Law of Diminishing Returns* and the *Optimal Span Hypothesis*. While not identical, the upper graph is a “bathtub” similar to Brooks’ “Fig. 3.” and it is quantified. The figure also includes two other quantified curves: the middle curve is similar to Brooks’ “Fig. 2.” (the *Law of Diminishing Returns*) and the lower curve is similar to Brooks’ “Fig. 1.” (“Men and months are interchangeable”). The sections that follow provide the equations and rationale used to generate the quantified results presented here.

## 1. Interchangeable People and Months

Brooks’ “Fig. 1.”, where people and months are interchangeable, is quite easy to quantify.

If one person produces one unit of work in one unit of time, ten people will produce ten units of work in one unit of time. The work produced by $P$ people in parallel in one unit of time will be:

$$W = P \tag{1}$$

See the lower curve in the figure above for a quantified version of Brooks’ “Fig. 1.” for a 100 MMM (“Mythical Man-Month”) project.

## 2. Law of Diminishing Returns

Brooks’ “Fig. 2.” is also easy to quantify using the well-known *Law of Diminishing Returns*. This law is based on the observation that adding people [2] in parallel seldom increases the work performed in direct proportion to the number of people. In general, if one person produces one unit of work in a unit of time, the work produced by $P$ people in parallel in one unit of time will be:

$$W = P^{f} \tag{2}$$

where $f = 1/2 = 0.5$ (square root) is a commonly used value. Brooks gives a version of this relationship, namely that the number of man-months to complete a software project is proportional to “complexity” raised to the power of 1.5 (equivalent to the case where $f = 1/1.5 = 0.667$). Brooks assumes that what he calls “complexity” is proportional to code size [3]. Since Brooks gives no mathematical justification for his use of the “power of 1.5”, I will use the more traditional square-root (“power of ½”) value in this Knol.

Assuming $f = 0.5$, the result is that $P = 2$ people in parallel will not double the productivity, but merely do about $W = 1.4$ units of work in time $t$. To double the productivity and increase the amount of work to $W = 2$ units will require adding about $3$ people (for a total of $P = 4$). To further increase the amount of work to $W = 3$ units will require adding another $5$ people (for a total of $P = 9$).

Complex projects are almost always performed by a department where one person, the manager, does no direct productive work on the project. Therefore we must subtract $1$ from the number of people assigned, and the work produced by $P$ people in parallel in one unit of time will be:

$$W = (P-1)^{f} \tag{2a}$$

See the middle curve in the figure above for a quantified version of Brooks’ “Fig. 2.” for a 100 MMM project.

As discouraging as the curves corresponding to Brooks’ Figure 2 may be, the amount of work continues to increase as people are added, but at a diminishing rate. This falls short of the situation represented by the Brooks’ Law “bathtub” curve where, after a certain point, an increase in people results in an increase in months rather than a decrease. Therefore, the Law of Diminishing Returns, in and of itself, no matter how low the value of $f$, does not and cannot theoretically justify Brooks’ Law.
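Equations (2) and (2a) are simple enough to check by hand, but a short sketch makes the argument above concrete. The function name `work_diminishing` and its `managers` parameter are illustrative choices, not part of the original analysis.

```python
def work_diminishing(people: int, f: float = 0.5, managers: int = 0) -> float:
    """Work per unit time under the Law of Diminishing Returns.

    managers=0 corresponds to equation (2),  W = P**f.
    managers=1 corresponds to equation (2a), W = (P - 1)**f.
    """
    productive = people - managers
    if productive < 0:
        raise ValueError("more managers than people")
    return productive ** f


if __name__ == "__main__":
    # Team sizes used in the worked example in the text.
    for p in (2, 4, 9):
        print(f"P = {p}: W = {work_diminishing(p):.2f}")
    # Work keeps rising with P, so months-to-finish keeps falling:
    # diminishing returns alone never bends the curve back upward.
    rates = [work_diminishing(p, managers=1) for p in range(2, 200)]
    print("monotonically increasing:", all(a < b for a, b in zip(rates, rates[1:])))
```

Because the output grows with every added person for any positive $f$, the schedule computed from it can only shrink, which is why a second effect is needed to obtain the “bathtub”.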

## 3. Optimal Span Hypothesis Yields Brooks’ “Bathtub” Curve

How may we quantify Brooks’ “Fig. 3.”? The answer is to multiply the effect of the *Law of Diminishing Returns* by the effect of the *Optimal Span Hypothesis* (OSH) [4] for a 100 MMM project, which yields the top curve in the above figure that is somewhat similar to Brooks’ “bathtub curve”.

In general, if one person produces one unit of work in a unit of time, the work produced by $P$ people in parallel in one unit of time will be:

$$W = (P-1)^{0.5} \times \frac{-\frac{2}{P-2}\log_2\frac{2}{P-2}}{0.53} \tag{3}$$

The first term, $(P-1)^{0.5}$, represents the effect of the *Law of Diminishing Returns* from equation (2a), and it is multiplied by the second term, $-\frac{2}{P-2}\log_2\frac{2}{P-2}\,/\,0.53$, which represents the effect of the *Optimal Span Hypothesis*. (See the following section for the derivation of the second term in this equation.)
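As a rough check on equation (3), the sketch below evaluates it for a range of team sizes. The helper names (`osh_factor`, `work`) are hypothetical, and the restriction to $P \ge 4$ is an assumption made here so that the ratio $2/(P-2)$ stays between 0 and 1.

```python
import math

BITS_AT_OPTIMUM = 0.53  # rounded maximum intricacy per node, as used in the text


def osh_factor(people: int) -> float:
    """Second term of equation (3): intricacy at span P-1 relative to the optimum."""
    if people < 4:
        raise ValueError("assumed domain: P >= 4, so that 0 < 2/(P-2) <= 1")
    ratio = 2 / (people - 2)
    return -ratio * math.log2(ratio) / BITS_AT_OPTIMUM


def work(people: int) -> float:
    """Equation (3): work per unit time for a department of P people."""
    return math.sqrt(people - 1) * osh_factor(people)


if __name__ == "__main__":
    for p in range(4, 25):
        print(f"P = {p:2d}: W = {work(p):.3f}")
```

Unlike equation (2a), this output rises, flattens out in the mid-teens, and then declines as $P$ grows, which is the shape needed to turn the schedule curve back upward.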

## Application of the Optimal Span Hypothesis to the Effectiveness of Hierarchical Organizations

According to *Hierarchy Theory* [Pattee, 1973], hierarchical containment and control structures are dominant in nature and human affairs because they are more effective than non-hierarchical structures. A control hierarchy, such as the typical management organization of a project, consists of one or more layers of managers who each control a number of workers. This type of hierarchy is typically drawn as an inverted tree graph, with the top-level manager at the top *root* *node*. *Edges* are drawn from that node down to subordinate nodes representing the lower-level managers and their departments, and so on, down to the *leaf* nodes representing the workers who have a direct role in producing the product or service. The number of departments reporting to a higher-level manager or the number of workers reporting to a first-line manager is called that manager’s *Span of Control*. [5]

Glickstein’s Optimal Span Hypothesis [6] depends upon a concept in *Information Theory* called the *intricacy* of a graph. The idea is that, given a set of resources represented as nodes and edges, there are many different ways to interconnect them. Each of these interconnection patterns may have a different amount of *intricacy*, which is related to information content and is measured in bits (binary digits). The more bits the more *intricacy*, and, in general, *intricacy* is good. All else being equal, structures with greater *intricacy* do more work than structures with less *intricacy*, even though both may consume the same amount of resources.

The intuitively pleasing thing about *intricacy* is that it may be maximized by a *moderate* amount of interconnection between nodes, neither too dense nor too sparse. Structures where *all* nodes are connected to *all* other nodes have zero *intricacy*, as do structures where *none* of the nodes are connected.

For example, three alternative management hierarchy structures are shown in the figure below.

Each of the above hierarchies utilizes the same number of People (sum of the number of Managers and Workers), but (a) has a “Broad” management span of control (MSOC) of 48, (b) a “Tall” MSOC of 3.3, and (c) a “Moderate” MSOC of 6.5. Intuitively, anyone familiar with management will conclude that (c) makes much more effective use of the resources available (49 People) and that (a) has too few Managers and (b) too many.

The Optimal Span Hypothesis is merely a mathematical way to quantify this intuitive knowledge!
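To put rough numbers on that intuition, the sketch below applies the per-node intricacy formula from the Appendix, equation (A9), to the three spans quoted above. Applying (A9) directly to an *average* span such as 3.3 or 6.5, and the node degree of $D = 2$, are assumptions of this illustration; the function name `intricacy_per_node` is hypothetical.

```python
import math


def intricacy_per_node(span: float, degree: float = 2.0) -> float:
    """Equation (A9): bits per node for a given span, assuming node degree D = 2."""
    ratio = degree / (span - 1)  # connectivity ratio A/M, from (A1) and (A2)
    return -ratio * math.log2(ratio)


spans = {"(a) Broad": 48, "(b) Tall": 3.3, "(c) Moderate": 6.5}
for label, span in spans.items():
    print(f"{label}: span {span} -> {intricacy_per_node(span):.2f} bits/node")
```

Under these assumptions the moderate span scores close to the 0.53 bits/node maximum, while the broad and tall structures both come out below 0.2 bits/node, in line with the ranking a manager would give intuitively.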

## Quantification of Brooks’ “Mythical Man-Month”

In the spirit of Brooks’ paper, let us assume the “Mythical Man-Month” (MMM) is a measure of the amount of work a *super* programmer we’ll call “Geek Zipperhead” can do in his or her best month, with absolutely no management help or interference of any kind. Unhindered by specifications, standards, design walk-throughs, code verification and validation, and not having to attend meetings of any kind, Zipperhead cranks out thousands of lines of wonderfully high quality working code per month. *It lacks nothing except documentation.*

Any real software department, measured by specified, standardized, verified, validated, and tested lines of code per month, is bound to be *less* efficient than Zipperhead. But *how much less* efficient?

## Single-Level Organization

Consider a single-level organization, a department with a total of $P = 8$ people, one of whom is a manager, and $P-1 = 7$ who exert productive effort on the product or service of the department. This department has a span, $S = 7$.

## Theoretical efficiency

What is the theoretical working efficiency of this single-level organization?

(a) According to the *Law of Diminishing Returns* with $f = 0.5$ (corresponding to the square root, SQRT, of the span), it is:

$$E_{SQRT} = 100\,\frac{(P-1)^{0.5}}{P}\ (\%) \tag{4}$$

The middle term is equation (2a), with $0.5$ substituted for $f$. That value represents the amount of actual work done by the workers in a month. It is divided by $P$, the amount of work that would be done by an ideal (and imaginary) group of super programmers without the help or interference of a manager. It is multiplied by $100$ to get a percentage. If we substitute $8$ for $P$, $E_{SQRT} = 33.1\%$.

Note the appalling effect of the intercommunication within a single typical department. According to the *Law of Diminishing Returns*, the coordination, communications, and management of seven workers and one Manager reduce the effectiveness to only a bit more than 33% of the amount of work a super programmer, Geek Zipperhead, could do on his or her own. Of course, Geek would take about three times as long to do the task and, since time is money, we ordinarily cannot afford the delay. Also, decisions are generally made by managers and they would all be out of work if they let Geek do the job without any management direction. If Geek left the organization, lack of clear documentation would be a problem. Therefore, despite the appalling news in this Knol (and it gets more so as you read on), hierarchical organizations are definitely the way to go.

But wait, we are not done! It gets worse when we consider the additional hit imposed by the *Optimal Span Hypothesis*.

(b) According to the Optimal Span Hypothesis, considered alone:

$$E_{OSH} = -100\,\frac{P-1}{P}\cdot\frac{\frac{2}{P-1-1}\log_2\frac{2}{P-1-1}}{0.53}\ (\%) \tag{5}$$

The middle term is equation (A9) [see Appendix], with $P-1$ substituted for the span. It is divided by 0.53, the number of bits/node that would result from an organization with an Optimal Span. It is multiplied by $(P-1)/P$ to account for the assumption that one of the people, the manager, does no direct productive work on the end product of the department. It is multiplied by $100$ to get a percentage. If we substitute $8$ for $P$, $E_{OSH} = 87.2\%$.

This is for a department with a management span of control of 7, which is pretty close to optimum. Try running the above equation with fewer or more workers in the department and their effectiveness goes downhill. For example, for $P = 5$, $E_{OSH} = 59.9\%$. For $P = 12$, $E_{OSH} = 80.3\%$. For $P = 22$, $E_{OSH} = 59.8\%$.

(c) Combining these two factors:

$$E_{C} = \frac{E_{SQRT}\,E_{OSH}\,\bigl(P/(P-1)\bigr)}{100}\ (\%) \tag{6}$$

When combining $E_{SQRT}$ and $E_{OSH}$ we have to multiply by $P/(P-1)$ because the non-productive manager has been accounted for separately in each and we only need to account for it once. We divide by $100$ because both are percentages. For the department under consideration ($P=8$), the net $E_{C} = 33\%$. For $P=5$, the net $E_{C} = 16.4\%$. For $P=12$, the net $E_{C} = 24.2\%$. For $P=22$, the net $E_{C} = 13\%$.
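The three efficiency measures are easy to tabulate. The sketch below is one possible transcription of equations (4), (5) and (6); the function names are illustrative, and the formulas are only evaluated for $P \ge 4$, where the logarithm's argument $2/(P-2)$ does not exceed 1.

```python
import math

BITS_AT_OPTIMUM = 0.53  # bits/node at the Optimal Span, as rounded in the text


def e_sqrt(p: int) -> float:
    """Equation (4): efficiency (%) under the Law of Diminishing Returns, f = 0.5."""
    return 100 * math.sqrt(p - 1) / p


def e_osh(p: int) -> float:
    """Equation (5): efficiency (%) under the Optimal Span Hypothesis alone."""
    ratio = 2 / (p - 1 - 1)
    return -100 * ((p - 1) / p) * ratio * math.log2(ratio) / BITS_AT_OPTIMUM


def e_combined(p: int) -> float:
    """Equation (6): combined efficiency (%), counting the manager only once."""
    return e_sqrt(p) * e_osh(p) * (p / (p - 1)) / 100


if __name__ == "__main__":
    print(" P  E_SQRT  E_OSH   E_C")
    for p in (5, 8, 12, 22):
        print(f"{p:2d}  {e_sqrt(p):6.1f}  {e_osh(p):5.1f}  {e_combined(p):5.1f}")
```

Evaluated this way, the functions reproduce the figures quoted for $P = 8$, $12$ and $22$. For $P = 5$ they give about 58.9% and 29.4%, so the $P = 5$ figures quoted above may come from a slightly different calculation.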

## Maximum theoretical efficiency

Note that the maximum $E_{C}$ is 35.2%, and it occurs for the case of six people on the job. Thus, if we organize our project with five software engineers and a manager, we will be about 1/3 as efficient as our super programmer, Geek Zipperhead, *which is the best we can hope to be*. In this case, $1\ \text{MMM} \approx 3\ \text{APM}$ (Actual Person Months).
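The location of that maximum can be confirmed with a brute-force scan. This sketch relies on the observation that equation (6) simplifies algebraically to $E_C = 100\,W/P$, with $W$ taken from equation (3); the search range of 4 to 40 people is an arbitrary choice made for illustration.

```python
import math


def combined_efficiency(people: int) -> float:
    """E_C in percent, using E_C = 100 * W / P with W from equation (3)."""
    ratio = 2 / (people - 2)
    osh = -ratio * math.log2(ratio) / 0.53
    return 100 * math.sqrt(people - 1) * osh / people


if __name__ == "__main__":
    best = max(range(4, 41), key=combined_efficiency)
    peak = combined_efficiency(best)
    print(f"most efficient team size: {best} ({peak:.1f}%)")
    print(f"actual person-months per MMM: {100 / peak:.2f}")
```

The last line of output is the conversion factor between MMM and APM at the most efficient team size, which is where the “about 3” in the text comes from.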

## Schedule and cost issues

Well then, how long will it take to complete the project?

If it is a 100 MMM project, Geek Zipperhead, if he or she could keep up the pace that long, could theoretically do it in 100 calendar months, or about 8.3 years. That is a very long time for any project! Our eight-person department could do it in about four years, which is probably within reason. But, what if we are willing to put up with a bit more inefficiency, what is the best we can do in terms of schedule?

It turns out the fastest we could do it would be about 34 months, using a department with 15 agents. Any more agents on the job will lengthen the schedule, not shorten it!

Brooks was correct and now we know the theoretical number where Brooks’ Law takes effect. For a single-level organization (one department), any more than about 15 people will actually lengthen rather than shorten the schedule!

What will the project cost? The figure below shows the relationship between duration and cost and highlights the **minimum cost** and **minimum duration** points. **Minimum cost** (for a non-Geek Zipperhead solution) is about 24 Person-Years, and that applies to a six-agent department working for 4 years. **Minimum duration** is about 2.8 years (34 months) and applies to a fifteen-person gang that will cost us a whopping 43 Person-Years. The **“best” plan**, which is most cost-effective in terms of both cost and duration, is somewhere between these extremes. For example, a nine-person department will complete the project in 3 years and cost 27 Person-Years.
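The duration and cost figures follow directly from equation (3): a department producing $W$ MMM per month needs $100/W$ months for a 100 MMM project, and the cost is the head count times the duration. The sketch below works this through; the names `work_rate` and `plan`, the search range of 5 to 40 people, and the treatment of a one-person team as producing exactly 1 MMM per month are assumptions of this illustration.

```python
import math

PROJECT_MMM = 100  # project size from the text, in Mythical Man-Months


def work_rate(people: int) -> float:
    """MMM delivered per calendar month, per equation (3)."""
    if people == 1:
        return 1.0  # the lone super programmer, by definition of the MMM
    if people < 5:
        raise ValueError("equation (3) gives no positive output for 2 to 4 people")
    ratio = 2 / (people - 2)
    return math.sqrt(people - 1) * (-ratio * math.log2(ratio)) / 0.53


def plan(people: int) -> tuple[float, float]:
    """Return (duration in years, cost in agent-years) for the project."""
    years = PROJECT_MMM / work_rate(people) / 12
    return years, people * years


if __name__ == "__main__":
    sizes = range(5, 41)
    fastest = min(sizes, key=lambda p: plan(p)[0])
    cheapest = min(sizes, key=lambda p: plan(p)[1])
    for p in (1, cheapest, 9, fastest):
        years, cost = plan(p)
        print(f"{p:2d} agents: {years:.1f} years, {cost:.1f} agent-years")
```

Run as written, the scan picks out six agents as the cheapest multi-person department and fifteen as the fastest, and the printed durations and costs agree with the rows of the summary table.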

Three to four years, which is how long it will take a single-level department to do a 100 MMM project, seems a bit long, but it may be acceptable. However, what will we do if we have a much larger project, say 500 MMM? It would take poor Geek Zipperhead, the super programmer, his or her entire career of over 40 years to do such a project, and the best single-level department about 20 years! Ask any management expert and he or she will say “create a two-level organization”!

## Two-Level Organization

The obvious solution is a two-level organization, with an executive manager who has several first-line managers reporting to him or her and with several software engineers in each first-line department.

The impact of the Law of Diminishing Returns and the Optimal Span Hypothesis will be felt at each of the two levels, as will the impact of even more managers who are essential to the hierarchical organization, yet do no direct productive work on the end product of the organization.

I am not exactly sure how to handle the calculations for a multi-level organization, but I believe the overall efficiency (as compared to Geek Zipperhead) will be quite low, perhaps in the 20% range in the best case.

A 500 MMM job may be able to be completed in around four years, but with the expenditure of 2,500 APM (Actual Person Months). If the project is required to be completed in two years, it might cost 10,000 APM! Perhaps the “best” plan would result in completion in about three years, at a cost of 3,500 APM.

## Summary

This Knol outlines a potential method for quantifying Brooks’ Law and the relationship between people and duration (or, as Brooks put it, “men and months”).

The graphs and tables demonstrate that, for any given-size job (expressed in MMM, “mythical man-months”), there is a particular organizational structure with a specific span that:

- Yields the lowest cost to completion, OR
- Yields the shortest project duration, OR
- Represents the “best” plan trade-off between cost and schedule

The key point is that they are not the same.

The material in this Knol may help experienced Program Managers trade off duration vs cost to find the most cost-effective way to reach project completion in a reasonably short time period. Application of this method and approach to “real-world” projects will further refine and improve the theory to a point where it is of practical use.

In *Organization Structures for Dealing with Complexity*, Bart R. Meijer [Meijer, 2006] devotes a sub-section to Glickstein’s hierarchical span theory [p 103] and states:

“… the optimal span theory of Glickstein offers another complexity based figure of merit for the intricacy of a hierarchical structure.” [p87] and “… Glickstein’s optimal span hypothesis offers an additional insight in the organisation and evolution of many structures …[and] what would be an optimal hierarchical organization of almost identical resources for a complex task, such as software development.” [p107]

Meijer further observes:

Big may not be beautiful. From the viewpoint of the coordination cost model and from Glickstein’s hierarchical span theory, one may seriously question the efficiency of large faculties and departments. [p140]

… field tests of the coordination cost model and Glickstein’s optimal span theory can improve our ability to build more effective research organizations [p 200]

## References

Brooks, Frederick P., Jr., 1974 – “The Mythical Man-Month,” extract from [Brooks, 1975] in *Datamation*, December 1974, pgs. 44-52.

Brooks, Frederick P., Jr., 1975 – “The Mythical Man-Month,” Addison-Wesley Publishing Co., Reading, MA.

Dorfman, Merlin and Thayer, Richard H., 1997 – (Editors) “Software Engineering,” IEEE Computer Society Press, Los Angeles.

Glickstein, Ira S., 1996 – “Hierarchy Theory: Some Common Properties of Competitively-Selected Systems,” PhD Dissertation, Binghamton University.

Klir, George J., 1985 – “Architecture of System Problem Solving,” Plenum, New York.

Miller, George A., 1956 – “The Magical Number Seven, Plus or Minus Two, Some Limits on Our Capacity for Processing Information” in *Psychological Review*, Vol 63, #2, pg. 81-97.

Pattee, Howard H., 1973 – (Editor) “Hierarchy Theory, The Challenge of Complex Systems,” George Braziller, New York.

Smith, Temple F. and Morowitz, Harold J., 1982 – “Between History and Physics” in *Journal of Molecular Evolution*, Springer-Verlag.

## Notes

[1] Brooks wrote this book back in 1975, well before it became “politically incorrect” to use the word “men” to represent software engineers of both genders. I’ve retained his usage in this paper only when paraphrasing or quoting Brooks.

[2] In the interest of gender neutrality, I’ve substituted the term “people” for “men”.

[3] The “Complexity” of anything, including software, may be defined in different ways, but is generally a measure of how hard it is to understand. Generally, larger code size is harder to understand, but this is not always the case. I would say that the complexity of a program is proportional to the minimum code size that will perform the function and is understandable to a competent software engineer.

[4] Glickstein’s work on the OSH was inspired by another classic paper, “The Magical Number Seven Plus or Minus Two,” by George A. Miller [Miller, 1956]. Miller observed that a host of psychological tests clearly demonstrate that humans can reliably distinguish no more than about five to nine gradations or categories using the senses of vision, hearing, touch, and taste. Miller called this limit the “Span of Absolute Judgment”. In his PhD dissertation [Glickstein, 1996], Glickstein showed that this span, five to nine, was characteristic not only of the cognitive domain, but also applied, in a statistically significant way, to the domains of human language, human (management) organization, animal and plant organization, cell structure and gene regulation organization, and, most convincingly because it does not involve human categorization or interpretation (which could be affected by our cognitive “programming” to favor groups of about seven), the folding of RNA and the structure of DNA. He also derived a mathematical formula for Optimal Span in a variety of physical contexts. To do this, he adapted the equation for the intricacy of a graph derived by Temple F. Smith and Harold J. Morowitz [1982]. Their equation is based on the well-known Information Theory work of Claude Shannon.

[5] A given node (department or worker) will generally interact strongly with one or more other nodes at its level in the hierarchy. The number of nodes a given node interacts with strongly at its level is called the *Degree* of that node. [Glickstein, 1996] showed that theoretically the most efficient hierarchy is one where, on the average, each node interacts strongly with about two other nodes. This type of hierarchy is called a *linear folded string*. In graph theory, a tree may be traversed to yield the equivalent string and a string may be parsed to reconstitute the tree. Some examples are:

(1) A written document may be represented from the top down as a tree (containment hierarchy), consisting of sections, paragraphs, simple sentences, words, and characters. Considered from the bottom up, written language may be represented as a folded string of characters, each strongly connected to the character before and after it and folded at the space character that demarks each word, then folded again at each simple sentence, folded yet again at each paragraph, and so on, up the hierarchy.

(2) Computers store all types of document files as binary strings of “1” and “0” bits folded every eighth bit into bytes. In a computer, each letter, number, punctuation mark, symbol, and space character is generally composed of a unique pattern of eight to sixteen bits.

(3) In the physical world, proteins are literally a folded string of amino acids, and RNA is a folded string of bases along the sugar-phosphate backbone. When a protein or a string of RNA is *denatured*, it becomes unfolded into a string. Natural forces cause these strings to fold to form the three-dimensional structures necessary for them to do their work.

(4) If you want to imagine the hierarchical tree of your project traversed (or *denatured*) into a string, consider all the managers with their employees in a long line. Each first-level manager is at the head of his or her employees, and these sub-strings are hooked together with each second-level manager at the head, and so on, up to the project manager, who leads the parade. To parse this string, just fold it at each demarcation point (manager) to form the structure of your project.

[6] See Optimal Span (Knol) for a derivation of the Optimal Span equation.

## Appendix – Derivation of Optimal Span and Intricacy

To help understand the theoretical underpinning of the Optimal Span Hypothesis (OSH), and how *intricacy* may be maximized, consider a group of $S$ nodes and how many edges we may use to interconnect them. Assuming unique, equally weighted, bi-directional links, the maximum number of edges, $M$, to interconnect $S$ nodes is:

$$M = \frac{S(S-1)}{2} \tag{A1}$$

If we know the node degree, $D$, we can compute the actual number of edges, $A$:

$$A = \frac{S D}{2} \tag{A2}$$

The ratio $A/M$ is the connectivity ratio, and the *intricacy*, measured in bits, of an interconnected group of nodes is:

$$I = -S\,\frac{A}{M}\log_2\frac{A}{M}\ \text{(bits)} \tag{A3}$$

According to this formula, if the connectivity ratio is very low or very high, intricacy will be low. In fact, maximum intricacy results when:

$$\frac{A}{M} = \frac{1}{e} = 0.37 \tag{A4}$$

where $e$ is the base of the natural logarithm (2.718281828…). Using the equations above, the formula for Optimal Span, $S_O$, is:

$$S_O = 1 + De \tag{A6}$$

The Optimal Degree is $D = 2$, and, for that case, the Optimal Span $S_O$ is:

$$S_O = 1 + 2e = 6.4 \tag{A7}$$

This is the span that yields the greatest intricacy for a given investment in nodes and edges, according to the tenets of *Information Theory*. Using the equations above, and assuming $D = 2$, the intricacy for any span, $S$, may be computed with the following equation:

$$I = -S\,\frac{2}{S-1}\log_2\frac{2}{S-1}\ \text{(bits)} \tag{A8}$$

The *intricacy*, measured in bits/node, is:

$$I = -\frac{2}{S-1}\log_2\frac{2}{S-1}\ \text{(bits/node)} \tag{A9}$$

When $S = 6.4$, intricacy is at its maximum and it is:

$$I = -\frac{2}{6.4-1}\log_2\frac{2}{6.4-1} = 0.53\ \text{(bits/node)} \tag{A10}$$
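The step from (A3) to (A4) and (A6) is not spelled out above. One way to fill it in, offered here as a sketch rather than as the derivation used in the dissertation, is to write $x = A/M$ and maximize the per-node intricacy $-x\log_2 x$ with respect to $x$:

$$
\begin{aligned}
x &= \frac{A}{M} = \frac{SD/2}{S(S-1)/2} = \frac{D}{S-1} \\
\frac{d}{dx}\bigl(-x\log_2 x\bigr) &= -\log_2 x - \frac{1}{\ln 2} = 0
\;\Rightarrow\; \ln x = -1
\;\Rightarrow\; x = \frac{1}{e} \\
\frac{D}{S_O - 1} &= \frac{1}{e}
\;\Rightarrow\; S_O = 1 + De
\end{aligned}
$$

The first line combines (A1) and (A2), the second gives (A4), and the third gives (A6). Substituting $x = 1/e$ back into $-x\log_2 x$ yields $\log_2 e / e \approx 0.53$ bits/node, matching (A10).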

The table below summarizes our choices. Our most practical choices are highlighted and involve completion in three to four years for a cost of about 24 to 27 agent-years.

| Agents on Job | Efficiency (%) | Duration (Years) | Cost (Agent-Years) |
|---|---|---|---|
| 1 (must be Zipperhead) | 100 | 8.3 | 8.3 |
| **6 (Minimum Cost)** | **35.2** | **4** | **23.7** |
| **9 (“Best” Plan)** | **30.6** | **3** | **27.2** |
| 15 (Minimum Duration) | 19.6 | 2.8 | 42.6 |