Note: In his book “Actionable Agile Metrics for Predictability”, my colleague Dan Vacanti expands on my FedEx example while discussing further the notion and benefits of “slack” in the software development context. You’ll find this discussion in Ch. 13 Pull Policies.
“The beauty of using a flow system and a visual control is that we can measure cycle time and we can observe context…
The trappings of false certainty we gave ourselves in previous methodologies are being replaced with the comfort of graspable variation in kanban…
We now see knowledge work for what it is – a chaotic system that is fully manageable and understandable by its outputs and contexts, but very much unknowable by its actions and specifics.”
– Jim Benson, Sep 2011
The title of this post contains an answer and a question. What I’m interested in sharing here, though, is the “question” that prompted the answer part of this title, along with my recollection of how both first came to me. I hope you find it interesting, if not amusing. As for providing an “answer” to the question part of this post’s title, that I’ll leave entirely to you.
Who Mentioned Variability?
This summer I had an opportunity to help out with a training class introducing the kanban method to a few members from several teams that all worked in a software development and operations support context. On the second day of this training we came to a section titled “Understanding Variability.” As the class began discussing why this might be important, I couldn’t help but flash back to a time when I heard my wife shout “It’s Slack! Who can’t see that?” as she was watching a TV program in our family room. I’m sure that at the moment of my flashback, more than a few in the classroom wondered how I could be enjoying the discussion so much as I sat there listening with a big smile on my face, chuckling a bit, as if I were somewhere else hearing a comic tell his latest and greatest jokes. The truth was, I was somewhere else for the moment.
The year 2009 was coming to an end, and it was a few days before New Year’s Eve. I was at home in the loft sitting at my computer reading or writing something, when I heard my wife shout from below “It’s slack! Who can’t see that?” as she walked out of the family room. The tone in her voice hinted it was the “Who can’t see that?” part that she was most excited and proud to share right at that instant. However, I was much more interested in knowing the yet unknown question that prompted her to excitedly shout “It’s slack!” It was also obvious she had more to share with me right then, and I’ll now share a bit of it with you.
A FedEx Story – When It Needs To Be On Time
On that evening, the History Channel aired a show titled “Job Site: Deadline Delivery.” I’m not sure from the program’s title if it was meant to be a series, but this particular show was specifically about how FedEx manages to get their packages delivered all over the world On Time! But how can this be All the Time?
The show tells how packages pass through intermediate locations in the FedEx system (all over the world) as they journey to their final destination. At these intermediate locations packages are sorted and redirected as needed from one plane to another (no surprise here).
However, part of this process, of course, requires loading the proper packages onto the proper planes. As planes are loaded, it is inevitable (due to variation in the number of packages, and their size and weight) that at times “the load” will exceed the capacity of the plane (either by volume or by weight). The show also highlights numerous other challenges that can delay a package, such as clearing customs. Therefore, every day somewhere in the system some packages must be left behind as others are allowed to move on in order to be on time.
So what happens to these left behind packages? How does FedEx ensure they too get delivered on time? The answer is “slack.” FedEx knows well the capacity of their system as well as the nature of its variations. They’ve studied the system, captured what they learned, and then used that knowledge to design the needed solution to the problem.
Background: This two-hour show depicts FedEx’s behind-the-scenes activity of the service they provide to their customers, one bound by extremely tight time deadlines. In many cases the service they provide is “critical,” often requiring them to get a package halfway around the world to arrive on time! The show’s production involved FedEx team members from around the globe performing their normal job functions. FedEx’s promise to their customer is on-time delivery even in an operation that has been described as ‘orchestrated chaos,’ and every workgroup understands its role and why it is so important. Without that focus on the “Purple Promise”, FedEx management says they would be just another company.
Spares in the Air – FedEx Keeps Empty Planes in the Air!
So how does FedEx use “slack” in their system to solve this problem? At the time this show aired, they had determined, through the study of their system, that they needed to have 16 “empty planes” up in the air, at strategic locations, every day (or was it 19? The point is they have some “routine” number of empty planes in the air). Yes, I said empty planes in the air! SPARES IN THE AIR! (I believe that is what they called them, or I came to call them later). Then as the natural variation in the system generates “left behind” packages, these “slacker planes” (my term, not theirs) are directed just-in-time (JIT) and as needed to pick up the “left behinds” and get them back on track. SLACK KEEPS YOU ON TRACK! (My version of summarizing their approach).
FedEx doesn’t know each and every day where the “left behinds” will appear. However, they have learned that they appear every day, how many empty planes they need, and approximately where those planes are needed to handle these variations wherever they appear. Is this so amazing? I think so!
While FedEx’s system is different than a typical software development context, what could we learn from them about studying and better understanding the behavior and nature of our own workflows? How might a step toward a quantitative approach to measure and understand the variation inherent in our systems help in our context? What forms of “slack” might we employ in our own systems to help us manage these measured variations (the “expected unexpected”) and get better at being on time more often?
When I say being “On Time” here, I’m not talking about developing an estimate up front and then adding some buffer of padding to our estimate to account for the “unexpected.” I’m talking about collecting historical data over time and using this to get a well-known picture of our system’s capabilities, as well as the nature of the work and how it goes through our software development system (or operations system, etc.). Then design, plan for, and maintain the “slack” in the system (again, I’m not talking about padding an estimate with time here) to meet the expected average demand as well as some level of the measured (expected) variations. Is this possible, or cost effective, in some or most software development contexts? Or is the idea of a “solution” to being on time (without this knowledge) at best simply wishful thinking?
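To make the contrast with padding an estimate a little more concrete, here is a minimal sketch in Python. It assumes a hypothetical history of work items arriving per week (the numbers are invented for illustration, not data from any team mentioned in this post) and picks the smallest weekly capacity that would have covered a chosen share of those weeks. The function name `capacity_for_coverage` and the 85% coverage level are assumptions made for the example.

```python
import math
from statistics import mean


def capacity_for_coverage(demand_history, coverage=0.85):
    """Smallest capacity that met demand in at least `coverage` of past periods."""
    if not demand_history:
        raise ValueError("need at least one observation")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    ordered = sorted(demand_history)
    # Nearest-rank percentile: the observation at or below which
    # the requested share of past periods falls.
    rank = math.ceil(coverage * len(ordered))
    return ordered[rank - 1]


# Hypothetical data: work items arriving per week over 12 weeks.
weekly_arrivals = [4, 6, 5, 7, 3, 5, 9, 4, 6, 5, 8, 5]

average_demand = mean(weekly_arrivals)
capacity = capacity_for_coverage(weekly_arrivals, coverage=0.85)
# Slack here is capacity held beyond the average demand.
slack = capacity - average_demand

print(f"average demand: {average_demand:.1f} items/week")
print(f"capacity covering 85% of weeks: {capacity} items/week")
print(f"slack above average: {slack:.1f} items/week")
```

The point of the sketch is only that the slack figure comes from measured variation in the system, not from a margin added to a single estimate. Whether holding that much spare capacity is cost effective is a separate question the data alone does not answer.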
Here’s a snippet from my personal “real world” experience. A few years ago, as a member of a development team, I also helped to introduce the kanban method into our broader software development environment. As part of this effort, I began gathering, plotting, and analyzing data for both MMF (minimum marketable feature; the point here is a work item larger than a two- to four-day story) and story work items. After collecting story counts, along with lead and cycle time data, for two to three months, this team started becoming adept at forecasting the completion of an MMF, typically one to two weeks after pulling it, to within plus/minus three to four days. While I won’t dive into the statistical or analysis details in this post, I will say they understood well by then the following variations in their workflow:
- variance of the number of stories per MMF
- variance of the number of stories completed per reporting interval (throughput or flow)
- variance of lead and cycle times per story
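These variations are the raw material for a forecast. As a rough illustration only (not the analysis this team used, which this post does not describe), here is a minimal Monte Carlo sketch in Python that resamples a hypothetical history of stories completed per week to estimate how many weeks a remaining story count might take. The function names, the sample data, and the 50% and 85% levels are all assumptions made for the example.

```python
import math
import random


def simulate_weeks_to_finish(remaining_stories, throughput_history,
                             trials=10_000, seed=None):
    """Resample past weekly throughput to simulate weeks needed to finish."""
    if remaining_stories <= 0:
        raise ValueError("remaining_stories must be positive")
    if (not throughput_history or min(throughput_history) < 0
            or max(throughput_history) <= 0):
        raise ValueError("history needs non-negative values and one positive week")
    rng = random.Random(seed)
    outcomes = []
    for _ in range(trials):
        done = 0
        weeks = 0
        # Each simulated week draws one past week's throughput at random.
        while done < remaining_stories:
            done += rng.choice(throughput_history)
            weeks += 1
        outcomes.append(weeks)
    return sorted(outcomes)


def percentile(sorted_values, share):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(share * len(sorted_values)))
    return sorted_values[rank - 1]


# Hypothetical data: stories completed in each of the last 8 weeks.
weekly_throughput = [3, 5, 4, 2, 6, 4, 3, 5]

outcomes = simulate_weeks_to_finish(remaining_stories=18,
                                    throughput_history=weekly_throughput,
                                    seed=7)
likely = percentile(outcomes, 0.50)
cautious = percentile(outcomes, 0.85)
print(f"50% of simulated runs finished within {likely} weeks, "
      f"85% within {cautious} weeks")
```

A forecast of this kind is a range with a stated likelihood rather than a single date, and the gap between the two percentiles is one simple way to see how much the measured throughput variation matters.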
To help bring some perspective to this, the cycle times for these MMFs ranged from two to six weeks. Now, some of you may think forecasting the completion of an MMF that size to within a week or less isn’t so impressive. However, consider that before this team started using a more quantitative approach to measuring and tracking throughput and the variances inherent in their system, they rarely (as in nearly never) succeeded at forecasting even the number of stories they could complete in one or two weeks, and they often worked in a “hurry-and-catch-up” mode. Going from that point to hitting forecasted MMF target dates within three to four days, over 13 months and 12 MMF-level work items, was quite an improvement. More importantly, they did this while working at a sustainable pace and without any heroic overtime efforts. Over this time they were also able to significantly reduce the technical debt that had accumulated previously under the “constant pressure” to hit “unhittable” targets. The earlier misses were not for lack of trying hard with other methods (methods where variance was not quantitatively measured).
Was it measuring quantitatively, then visualizing and understanding the variances inherent in their system that helped them forecast better? Was it “slack” (tactics) they added as they adapted their process that got rid of the “hurry-and-catch-up” working mode and allowed them to reduce technical debt too? Did moving to a statistical quantitative measure of their (throughput) capability account for the overall improvements? Great questions to ponder, and there is definitely more about this experience to possibly revisit in future posts.
Landing This Post
However, for now I’ll close with what may be even more amazing and funny to me about the FedEx story (I know it makes me laugh every time I think about it). To be clear here, I didn’t see this show myself that evening (if I remember correctly I caught a second airing of it a few weeks later). Rather, it was in fact my wife who first relayed to me “in some detail” almost all of this information about FedEx. She told me later that what “pulled” her into watching this show was that “I could just hear you in my head, recounting conversations where you spouted off about slack, capacity, and flow. So before I knew it, I found myself saying It’s Slack! It’s Slack! You Need Slack In The System! Who Can’t See That?”
Wow, I guess she really does listen to me sometimes. Now that is amazing!