“In everyday life, the Flaw of Averages ensures that plans based on average customer demand, average completion time, average interest rate, and other uncertainties are below projection, behind schedule, and beyond budget.” – Sam L. Savage, 2009 – The Flaw of Averages

“Variation is the hard reality, not a set of imperfect measures for a central tendency. Means and medians are the abstractions.”  – Stephen Jay Gould, 1985 – “The Median Isn’t the Message”

“Essentially, all models are wrong, but some are useful” – George E. P. Box, “Empirical Model Building and Response Surfaces”, (1919 – 2013)

“Remember that a model is not the truth. It is a lie to help you get your point across.” – Sam L. Savage, “The Flaw of Averages”, 2009

“You are allowed to lie a little, but you must never mislead.” – Paul Halmos, mathematician, (1916 – 2006)


Last week I read an article about yet another tool showing the ability to produce CFDs (cumulative flow diagrams). Maybe you already use one of these tools that help you visualize your workflow and generate CFDs for you automatically (or “auto-magically”) as part of the reports they provide. Or, perhaps like me, you still generate them mostly using MS-Excel. Either way, have you wondered just a little about how a CFD works?

As most do, this article displayed a line extending vertically between “stages” on the CFD (or workflow processes, as I often call them) and identified this distance as the WIP (work-in-progress) on a specific date for the respective stage (or stages) of interest. It also described another distance, a line extending horizontally between the stages of interest on the CFD, and identified it as the “average lead time” for the requests (workitems) arriving on a specific date. That is, the average time for a request (workitem) to “flow” through (arrive into and depart out of) one or more stages of interest. Lastly, it described the slope of a line as the mean delivery rate of requests (workitems) flowing into or out of a stage (workflow process), depending on whether you view the workflow upstream or downstream. Note: for more on the basics of reading a CFD, see this earlier post here.
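To make these three readings concrete, here is a minimal Python sketch. The daily cumulative counts are invented for illustration, and the function names (`wip_on`, `delivery_rate`, `horizontal_lead_time`) are my own, not taken from any particular tool. It assumes one stage, one count per day, and requests leaving roughly in the order they arrived.

```python
# Hypothetical cumulative counts for one stage, one entry per day (day 0..7).
# These numbers are invented for illustration only.
arrivals = [2, 4, 6, 8, 10, 12, 14, 16]    # cumulative requests that entered the stage
departures = [0, 0, 2, 4, 6, 8, 10, 12]    # cumulative requests that left the stage


def wip_on(day):
    """Vertical distance between the two lines on a given day."""
    return arrivals[day] - departures[day]


def delivery_rate(first_day, last_day):
    """Slope (rise over run) of the departures line between two days."""
    return (departures[last_day] - departures[first_day]) / (last_day - first_day)


def horizontal_lead_time(day):
    """Horizontal distance: days until cumulative departures reach the
    cumulative arrivals count seen on `day`. Returns None if the chart
    ends before that count is reached."""
    target = arrivals[day]
    for later in range(day, len(departures)):
        if departures[later] >= target:
            return later - day
    return None


print(wip_on(3))                # 4 requests in the stage on day 3
print(delivery_rate(2, 7))      # 2.0 requests per day
print(horizontal_lead_time(3))  # 2 days
```

With these made-up counts, the horizontal reading (2 days) matches WIP divided by delivery rate (4 / 2.0), which is the relationship Little’s Law describes for a steady workflow.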

A Difference Is Not An Average, Right?

It is easy to understand that WIP is simply a “difference” between two counts on the CFD and represents the number of requests (workitems) at a point in time. Similarly, seeing the slope as a simple rise-over-run calculation of a number of requests per unit of time (a rate over a period of time of interest) is not a complex concept to accept. But what about the notion of the “average lead time” derived from the CFD? How is it that a “difference” between two points in time read from the CFD (e.g. calendar dates) can represent an “average” amount of time for a request (workitem) to flow through a stage (workflow process)? A “difference” that represents N numbers summed up and divided by N. Yes, really! But how can this be?

In a previous post here I provide a reference for the best starting point I’ve found so far to explore further what allows CFDs, and in a related manner Little’s Law, to work. I’ll tell you now it is not magic! However, in this short blog post I can only briefly summarize what I learned from that reference, and from related references since then, by saying CFDs (and similarly Little’s Law) are very robust and applicable in a wide variety of domains and contexts.
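One way to see that it is not magic is a small counting argument, sketched below in Python. This is my own illustration, not something taken from the reference mentioned above. It assumes whole-day timestamps and that every request both arrives and departs inside the charted window; the `items` list is hypothetical. Under those assumptions, the area between the arrivals and departures lines (the daily WIP summed over the window) equals the sum of the individual lead times, so dividing that area by N gives the same number as averaging the N lead times directly.

```python
# Hypothetical requests as (arrival_day, departure_day); invented for illustration.
items = [(0, 3), (0, 1), (1, 6), (2, 4), (3, 5), (4, 6)]
days = range(0, 7)

# Build the two CFD lines: cumulative arrivals and cumulative departures per day.
arrivals = [sum(1 for a, d in items if a <= t) for t in days]
departures = [sum(1 for a, d in items if d <= t) for t in days]

# Vertical distance (WIP) on each day, and the area between the two lines.
wip = [a - d for a, d in zip(arrivals, departures)]
area_between_lines = sum(wip)

# Average lead time read from the CFD versus computed from the raw requests.
avg_from_cfd = area_between_lines / len(items)
avg_from_items = sum(d - a for a, d in items) / len(items)

print(wip)             # [2, 2, 3, 3, 3, 2, 0]
print(avg_from_cfd)    # 2.5
print(avg_from_items)  # 2.5
```

Note that the individual lead times here range from 1 to 5 days, and the CFD lines alone cannot tell you which request took which. The match also weakens when requests are still in progress at the edges of the window, which is one example of an assumption worth checking in your own context.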

However, a key point here is using CFDs effectively requires understanding the assumptions they are based on and in particular the nuances in different contexts. This is even more important in my opinion, when using a tool that “auto-magically” creates them for you. Consider that as you deviate more and more from these assumptions and nuances specific to your context, any “average” lead time you read from your CFD becomes a “grimier approximation” (less meaningful, less useful) and will likely not reflect the actual average lead time of those requests completing.

But Remember An Average Is Just An Average

Even if you’re fully convinced the CFD you have in hand is giving you a good approximation of the average lead time for requests (workitems), it is only an “average.” Is this also important to understand? Yes! Especially when considering how you might use it to make decisions about next steps to improve your workflow processes, or to communicate to others when a request (workitem) might be completed, or when a set of requests (workitems that make up say a feature, MMF, MVP, or epic) will be done.

Am I suggesting that a CFD is not useful? In fact, it is quite the opposite. I found studying how they work to be extremely helpful in more deeply understanding principles of flow and concepts of pull scheduling. In particular, in understanding how “everyday rules” (policies), those written down explicitly, if any, and especially those unwritten ones, maybe even “unspoken” ones, like “who screams loudest wins”, impact the performance and predictability of workflows.

However, a second point I’m making is that I would not stop with looking only at the “averages” read from the CFD, especially that average lead time. As with any other average used in analysis, they should be considered in light of the distribution of the data used to create them. Visualizing the data with simple tools like scatter plots and histograms helps you go beyond the initial information found on the CFD. Add to that also a close inspection of the workflow context that produced the data used to create the CFD (or any average), and especially the “everyday rules” (policies).
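As a rough sketch of what looking at the distribution can add, the Python below compares the mean with the median and prints a simple text histogram. The `lead_times` values are hypothetical, chosen only to show how a couple of long-running requests can pull the average away from what most requests experience.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical lead times in days for ten completed requests (invented data).
lead_times = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

average = mean(lead_times)
middle = median(lead_times)

# Share of requests that finished within the average lead time.
share_within_average = sum(1 for t in lead_times if t <= average) / len(lead_times)

print(f"mean   = {average} days")
print(f"median = {middle} days")
print(f"share finishing within the mean = {share_within_average:.0%}")

# A very simple text histogram: one row per observed lead time.
counts = Counter(lead_times)
for value in sorted(counts):
    print(f"{value:>2} days | {'#' * counts[value]}")
```

Here the mean is 5 days while the median is 3, and two requests took 9 and 19 days. A single average would hide both the typical case and the long tail, which is exactly what the histogram makes visible.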


In summary, I’m recommending these three things: learn more about the assumptions and nuances for specific contexts that allow CFDs to work; go beyond just looking at the average lead time read from a CFD by inspecting the scatter and shape of the data used to create the CFD; and look closely at the policies governing how requests (workitems) enter, move through, and exit your workflow processes.

In particular, they’ll help you learn a lot more about the variation (“the hard reality”) of your workflow process, the hard reality that likely contributes in a much greater way to long lead times (duration) for individual requests (workitems). That is, influences negatively affecting lead times in ways that are often much greater than those contributed by the “complexity” of the requests (workitems) arriving into your workflow, or the “level of effort” (direct hands-on time) needed for completing a request (workitem). This deeper understanding, I believe, is essential for creating more predictable workflows. So, I hope you’ll give them a try and not be just an average CFD user!

Take care,