Bosch Connected World Hackathon

There is no shortage of things to do in Berlin, and if you’re curious about high-tech, there are lots of opportunities to dive in and get connected to the high-tech culture. In my almost two years here I’ve had a great time spending my spare time attending events at startup-accelerators and co-working spaces, meetups, conferences and even hackathons.

When I saw the notice for a hackathon sponsored by Bosch I wasn’t sure what to think. To me, Bosch means spark plugs and power tools. It turns out there is a lot more to Bosch than that, and they are developing a big presence in the IoT space. So I decided I had to have a look at the Bosch Connected World conference in Berlin. It was all about connected devices, and I would have loved to listen to the talks in the conference.

But there were also 4 separate hackathon tracks and I was curious about those. I signed up for the Connected Car hackathon. I’ve been learning Android mobile app development in my spare time lately, so it was an opportunity for me to see how my skills stacked up.


The way these things work is that the general topic of the hackathon is introduced, then anyone can pitch ideas to the audience. After the pitches are made, people circulate around the room and form teams to work on each idea. Then the teams design and code up their idea, working until the final presentation on the afternoon of the next day.

I decided to pitch a simple idea and was a bit surprised, and very relieved, when 4 university students from Stuttgart joined in and we formed a team. Since it was a Connected Car hackathon, we decided to integrate an Android app with the Bosch MySpin environment so that it would run on the car’s entertainment system.

The Bosch guys even brought a Jaguar with the MySpin system installed. We never managed to install our app in the car, but ran it on a prototype unit instead.


After two days of hacking we were able to retrieve GPS coordinates from one phone and display Google Maps driving directions on another MySpin-connected mobile phone. Very cool! I was somewhat relieved that I was able to contribute my part, and feel that I’m keeping pace with the industry.

The event was really well organized at Cafe Moskau, which turned out to be a great venue. Cafe Moskau was built in 1964 and is situated in the former East Berlin. The architecture of the building is fantastic, such a departure from the stern Soviet-era concrete buildings lining the streets around it. Great lunch and dinner foods and refreshments served by an attentive staff – this was definitely not your average hackathon. There must have been more than 200 people hacking away in several different rooms. There were different hackathons for Connected Car, Industrie 4.0 (a big topic in Germany; I think it’s termed Industrial IoT elsewhere), power tools and cloud-connected sensors. Check out the Twitter hashtag #BCX16 to see more on the event, or have a look at this video.

At the end of the two days we enjoyed a nice wrap-up dinner courtesy of Bosch where hackers and conference attendees mingled. I must say they did a good job putting this whole event on. And a big thanks to the hackathon organizers – they clearly put a lot of effort into it, and it all went off without a glitch.


Paint The Room!

I occasionally deliver internal Agile training classes, and one of the things I have struggled with is explaining the concept of Agile estimation. I’ve been searching for a good game to get the point across, but could never find anything that worked well. Sometimes I could work off a real backlog of items and features supplied by the class participants, but more often than not it was almost impossible to find a set of features that everyone in the room could relate to. So some people would get the point, and others would not.

One day in such a training session I had a room full of people from various backgrounds and was silently stressing over how to best cover Agile estimation. Then, somehow, two previously unacquainted neurons connected for the first time and in a flash I had the solution.

This game will teach Relative Estimation, Planning Poker, Story Points, Backlog Refinement, Story Splitting and Story Elaboration all in one short 20-minute session. All you need is some Planning Poker cards. It’s so simple you just have to try it to see how effective it is.

If you’re not familiar with how to use Planning Poker, then this page from Mike Cohn (@mikewcohn) is a good start.

I call it the “Paint the Room” game.

Paint The Room

The game is played in whatever room you are delivering the training in. Chances are your particular room has at least four walls, a floor and a ceiling. Good – that’s all you need! Now here is the setup:

Working as a team we have been given the job of painting the room which we are currently in. The customer’s requirement is (intentionally) vaguely stated as “please paint the room” and we have drawn up an initial project backlog which looks like this:

  • paint the north wall
  • paint the south wall
  • paint the east wall
  • paint the west wall
  • paint the ceiling
  • paint the floor

You might of course give each wall a descriptive name such as “window wall”, “entry door wall” etc. instead of north/south/east/west. Many people can’t tell North from Up.

In the game you will represent the Customer/Product Owner when the team has questions about the job. The team has to come up with an estimate to paint the room. Distribute Planning Poker cards to everyone. Keep the number of participants to the size of a scrum team. Everyone else can just watch and still get the full benefit.

The nice thing about the game is that everyone can relate to the task of painting a wall, even if they have never touched a paintbrush. Furthermore, assumptions about what is involved in doing a good paint job vary widely among white-collar knowledge workers.

Establish the “Anchor Feature”

In relative estimation it is useful to have an anchor feature that we can associate a known story point quantity with. Other features will be estimated relative to the anchor feature.

Pick one wall that looks like it is medium effort relative to all the others and tell the team that “this wall is a 5-story point wall”. That is your “anchor feature”. All other backlog items will be estimated relative to that wall. This establishes what a 5-story-point wall looks like.

Most engineers initially have a hard time separating the idea of Story Points from person-days when we talk about software features, but have no trouble at all accepting the abstract notion of a unit-less “5 Story Point wall”.

Now we’re ready to estimate.

Estimate one wall

Pick one wall that is a little less complicated than the anchor wall and ask the team to estimate how many story points that wall is relative to the anchor wall. Use the Planning Poker method to get input from each team member.

Simply ask “So assuming our anchor wall is 5 Story Points of effort, how many Story Points of effort is this wall?”

Make sure everyone provides an estimate, and that everyone flips their cards all at once. One of the key benefits of Planning Poker is that it protects against the influence of expert estimates, which may or may not reflect what the team can actually do. It’s essentially a variation of Wideband Delphi estimation. And indeed you might get a wide range of responses, from 1 story point up to 8 or even 20.
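To make the mechanics concrete, here is a minimal sketch of how a single round might be tallied once all the cards are flipped. The names, numbers and the simple convergence rule are all invented for illustration:

```python
# A toy sketch of one Planning Poker round; names and numbers are invented.
def poker_round(estimates):
    """All cards are revealed at once; return the outliers to discuss."""
    low = min(estimates.values())
    high = max(estimates.values())
    return {
        "low": sorted(n for n, e in estimates.items() if e == low),
        "high": sorted(n for n, e in estimates.items() if e == high),
        # an assumed convergence rule: high estimate within a factor 2 of the low
        "converged": high <= 2 * low,
    }

# First round against the 5-point anchor wall: a wide spread to talk about
result = poker_round({"Ada": 1, "Ben": 3, "Cleo": 8, "Dev": 20})
print(result)
```

After the high and low estimators (Ada and Dev here) explain their assumptions, a second round typically shows a much tighter spread.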

This is where the fun starts.

Talk about the high and low estimates. The value of Planning Poker is in the conversation which brings out unspoken assumptions, and you will really see that in this exercise. As the high/low estimators explain their reasoning for their estimates, you will hear things like

  • I assumed that we had to put masking tape on the trim and around the edges of the windows
  • I assumed that we had to remove all the covers on the electrical outlets
  • I assumed that we have to cover the floor
  • I assumed that the prep work was done (or was not done) ahead of time

In other words, different people make different assumptions about what the job really entails when the job is imprecisely specified, so they come up with different story point estimates. That is exactly what happens when software engineers with different skills and specializations give estimates on vague requirements. Many times you will find that there is an experienced handyman in the team, and he will be on one end of the scale relative to the others. That is ok, it’s usually that way with software experts too.

The other thing that usually happens is that the team starts spurring each other on with more clarifying questions:

  • are we using rollers or brushes?
  • do we assume that all preparation/masking tape is done before the job?
  • do we all work on one wall at a time, or several in parallel?
  • do we have to purchase supplies first?
  • should we move the furniture out of the room?

and so on.

The conversation will flow back and forth a little as the team considers the now-spoken assumptions behind the high and low estimates. As the facilitator you just have to sit back and listen as the team starts peeling the layers off the onion. If the team gets stuck, you might volunteer some of the questions above, but never solutions. Your job is to make sure that the team talks about their assumptions and agree on the how between themselves.

It’s a game, so don’t waste the opportunity to have some fun with it. If you get questions from the team, don’t be shy to throw in new requirements or constraints to make the job harder. “No you can’t take the windows out of the window frames before painting” or “yes of course you must remove the windows from the window frames before painting” and “well yes, of course I want the floor painted in a black-and-white chequer-pattern. What did you think?” are good conversation-starters. The team may groan all they want, but every small requirement change makes the game a little more like reality.

As the team discusses their individual assumptions they will probably identify some new tasks. Most teams end up adding new items to the backlog:

  • remove electrical outlet covers
  • put masking tape around the windows
  • put electrical covers back on
  • purchase supplies
  • clean up

After a few back-and-forth comments you might get to a point where the team has a new common enhanced understanding:

  • yes we will remove the electrical outlet covers
  • we will use rollers for painting large areas and special brushes around the windows
  • we will not cover the floor (it will be painted last)
  • etc.

Not too different from talking about how a given feature works, what the hidden requirements are, and so on. The actual agreements and assumptions are not important, as long as the team ends up with a better shared understanding of the task. The new knowledge is an illustration of Story Elaboration, and you might even want to take the opportunity to specify some Acceptance Criteria, such as “no paint smudges” and “even color application” (might require two coats).

Do another round of Planning Poker estimation. The second round of estimation should have a tighter range since we have now had some conversations and clarification about the “requirements”.

Discuss the high/low estimates again and do one more round of estimations if needed. If you picked a good “anchor wall” and picked a slightly less complicated wall to estimate, then the Story Point estimates should settle around 3 or 5 Story Points. Good enough, let’s move on.

Estimate the ceiling

Now that the team has seen the Anchor Wall and produced a converged estimate for a simple wall, you ask the team to place a story point estimate on painting the ceiling. Again, it must be relative to the Anchor Wall of 5 Story Points.

Discuss the high and low estimates. Some people will estimate the ceiling as being easy, others will see added complexity in simply reaching high, using ladders, how to deal with light fixtures etc. In any case you will find more un-spoken assumptions that only come to light when you talk about the estimates.

This is also a good opportunity to point out that the customer requirement initially was just to “paint the room”. Nobody gave specific direction on what to do about things like the light fixtures. It’s a perfect way to illustrate that it’s critical to go back to your customer and clarify what the expectation is. I usually respond to the team, when they ask, that I like the way the light fixtures look so I don’t want paint all over them. Therefore I want the light fixtures removed and then put back in place after the ceiling has been painted. You can then show that it’s ok to split the story into three: (1) remove fixtures, (2) paint the ceiling and (3) reinstall fixtures. That is a lot more work than the initial expectation (do we need an electrician?), but hey that’s R&D. Sometimes we uncover unexpected complexity. Update the project backlog accordingly.

Check the backlog

As you have gone through the estimating process a couple of times you will see that the backlog has grown with new entries, and some stories have been split into smaller stories. That illustrates the mechanics of backlog refinement. In every case the refinement was a direct result of discussing with the customer and/or uncovering hidden assumptions.

Wrap it up

I never go through the whole backlog, but stop the game after estimating two or three backlog items. By then we have covered Relative Estimation, Story Points, Story Elaboration, Story Splitting, Planning Poker and Backlog Refinement. Not bad for 20 minutes of poker!

I’ve run the exercise a few times now, and every time the team has come out with a better understanding of how to use Agile estimation techniques in their projects. Hopefully it works for you too.

If it does, let me know!

ThingsCon 2015

Living in Berlin has its benefits. Lots of things are happening, especially now that winter has finally given way to spring. One of the nice surprises lately was ThingsCon – the small but premier European conference on the Internet of Things. I wasn’t sure what to expect, but I registered anyway. Berlin has so much energy around the startup and technology scene it’s just insane. ThingsCon turned out to be another one of those events that buzzed with energy.

The two days of talks and workshops focused not only on the technology of the Internet of Things, but also, and more importantly, on the social responsibility and implications of the Things we put on the Internet.

The opening keynote by Warren Ellis set the tone. He doesn’t care how we build this stuff or how cool the next gizmo is. He just wants to get home without worrying that the connected world will crash and shut him out of his connected house. He doesn’t want to fiddle with buttons or operate his light switch from his iPhone. He just wants it to work, and he doesn’t want to be bothered with the technology part. He is, as he put it, “your worst nightmare – I am your products’ and services’ most likely customer.”

It’s not all doom and gloom for him though. He is excited by the prospects, and he pleaded with his audience of technologists to build products that actually enrich people’s lives.

It was probably the best opening perspective that a crowd of young, excited and energetic product innovators could hear. In the end what matters is how these products enhance and enrich our lives. No amount of tech-cool or big-data monetization can make up for a deficient user experience.

This was the second year of ThingsCon and I can tell that this conference will grow. I am definitely going back next year to keep track of how this field is maturing. I was expecting a very “thing-centric” technology conference but was delighted to see that for the most part the human side and the personal aspect took center stage. Yes, there was lots of attention to the technology of IoT, but most of the talks and workshops I attended framed the technology inside the user experience.

The Internet of Things is definitely a darling buzzword, but at the same time it is difficult to imagine how an exponential growth of connectedness could not have some fundamental impact on your daily life. If it is the next evolution of the Internet, it will be the Invisible Web. The more we push technology for technology’s sake, the less adoption we will see. To me it’s not so much about making new cool gadgets. The best technology, as far as I am concerned, is invisible.

That is the challenge for the folks working in the IoT space: not to highlight the new gizmo or technology, but to embed it invisibly in a natural and socially responsible way so as to improve and enrich our lives.

Socially responsible. That phrase kept coming back to me as I sat through workshops and talks and spoke with the folks around me. The IoT manifesto draft looks a little “thing-centric” in its first form, but in the workshop critique all the comments were human-centered, so it’s a nice signal that the folks involved with the technology recognize that there is a lot more to the challenge than just “the thing”.

Two days isn’t enough, but I got a first sense for where the IoT technology curve is right now. I also got a much better appreciation for the coming struggle of making sure the technology, devices and the data they collect are managed in an ethical and responsible manner. Once you endow personal devices with consent to collect information about yourself in your daily life, the privacy issues surface very quickly.

We will have to find a balance between personalized services (which most of us want, but which require sharing of personal information) and privacy (which we also want, but which precludes any personalization of service). Ease of use must be a foremost concern, but security and privacy could make a seamless experience difficult to achieve.

At the moment I am visualizing the IoT conundrum as the 4 corners of a square with conflicting attributes:

Personalized  –  Secure  –  Easy to use  –  Private.

If you can’t have all 4 of them, then where does the compromise start? In those 4 constraints, innovation waits. Until ThingsCon 2016…

Scaling Agile – it’s not the same problem

A brief tweet from Al Shalloway (@alshalloway) this week, combined with a long train journey, prompted me to write this blog post.

That is exactly one of the key ideas in Nassim Nicholas Taleb’s (@nntaleb) Antifragile. The kinds of problem you get at scale are not the same kinds of problems you deal with in small numbers. To quote Nassim Taleb:

“A city is not a large village, a corporation is not a larger small business”.

Likewise a single Agile team does not behave the same way that a project team made up from several Agile teams does. The method and style of communication changes, as does the kind of risks and issues we deal with. It’s a transformation into something new, not a simple multiplication.

Why scale breeds complexity

If interactions grew linearly with the system size, then scale wouldn’t be such a problem. Each time you add a new member to the system, the complexity (C) would increase by one, since you would only add one more possible relationship, as visualized below:

linear growth

Unfortunately, as real systems grow in scale, their complexity grows much faster than linearly, because each component influences or interacts (directly or indirectly) with many others. Each member you add can relate to every member already in the system, so with n members there are n(n-1)/2 possible pairwise connections. It’s an old idea that is surprisingly often neglected by those who design systems. You can visualize how the number of possible interactions grows as we add one more circle to each diagram below:

geometric growth

Complexity is a property of the behavior of the system, not the structure of it. That is, it’s not the number of components that makes it complex, but how they interact – as visualized by the number of lines in the diagrams above.
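You can put numbers on the two diagrams with a quick back-of-the-envelope sketch. The counting functions below are mine, but they match what the pictures show:

```python
def linear_links(n):
    # the hypothetical linear case: each new member adds exactly one relationship
    return n - 1

def possible_links(n):
    # the real case: every member can interact with every other, n*(n-1)/2 pairs
    return n * (n - 1) // 2

for n in (2, 5, 9, 20, 50):
    print(f"{n:3} members: {linear_links(n):3} linear links, "
          f"{possible_links(n):5} possible interactions")
```

At the Scrum-team limit of 9 people there are already 36 possible communication paths; at 50 people there are 1,225.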

Conversely, we may have systems that have a complicated structure but yield simple behavior. For example, a mechanical pocket watch has a complicated structure but very simple and (thankfully) predictable behavior.

So it’s complexity resulting from interactions and behaviors we have to be concerned about. This is why we see different types of problems at different levels of scale.

Aside: scale up and down

In order to be considered “scalable” the transformation must be two-way. I have seen software systems, for example, that are designed to “scale” but can’t run on modest hardware. That is not scaling, that is just big and bulky. Somehow we forget that scaling is not the same as simple expansion. To me a good scaling implementation is a dynamic one, that can scale both up and down. If you can only go one way, it’s not “scaling”.

So that’s the problem: the behavior of a small group is different than that of a large group, which is again different from that of a group of groups… and the effect shows up sooner than you might think because of the number of new communication paths.

The oft-cited Dunbar’s number says that people can maintain some sort of social relationship (though not necessarily a deep one) with up to around 150 people, but we see behavior changes way sooner than that. The upper bound of a Scrum team (9 people) is based on experience and seems to be re-affirmed time and again. Beyond that, people see themselves as part of a mass, not a team. In my organizations I never had a manager with more than 10-12 direct reports.

Sometimes trouble shows up at even smaller numbers. It is said that behind every successful man stands a woman; behind every failed man there are two. I think that joke works with Scrum teams and Product Owners also, except it’s not funny.

Scale the problem down

This is where Agile and Antifragile meet again: big problems are best solved when you can scale them down and distribute the difficulty. The secret to successfully executing big projects is not to scale the project team up, it’s to scale the problem domain down.

What I mean by that is that we shouldn’t simply look at the project requirements and then figure out how to scale a team to build the whole thing all at once. Approaches such as Minimum Viable Product (MVP) and applying Design Thinking to focus the problem space can save a lot of time, effort and money by applying resources only to what is critical and important.

What about decision-making? The problem of scaling decision-making is a tough one. Many times we don’t even consider that we can change the way decisions are made, and so we fall back on a central person or core team that is responsible for making all the individual decisions. It’s a fragile setup because these teams are disconnected from what happens on the ground.

Scaling decision-making

To me every scaling problem is a delegation and distribution problem. The most ineffective way of all is when someone decides to distribute workload but is unwilling to delegate authority. Micro-managers would object to the micro-manager label except they are too worn out from trying to apply their bottleneck everywhere at once… and they don’t read my blog anyway. When you’re too close to the tree you can’t see the forest.

So what can we do? First of all, recognize that you can’t possibly keep more than half a dozen balls in the air at one time. Divide and Conquer, distribute, subdivide… do what it takes to bring things into manageable chunks. This is after all what Agile methods do: break things down into manageable smaller pieces of work.

Distribute Distribute Distribute

Distributed systems, whether they are mechanical, software, political or social, have some compelling properties. They are individually self-sufficient, resilient and effective. Centrally organized systems, on the other hand, are attractive only on paper or at very small scale. They work fine as long as the “central” part is available and capable of managing the information flow, but quickly break down and become a bottleneck in any but the simplest projects.

So work obviously needs to be decomposed and distributed amongst multiple teams. Nothing new here, whether you distribute according to traditional functional teams or cross-functional Agile teams it has to be done.

It is not enough to just distribute the work, you also have to distribute authority and decision-making when you’re dealing with more than a couple of teams.

Central Mission command, local tactical decisions

One of the most powerful and elegant ideas in Don Reinertsen’s (@DReinertsen) well-equipped arsenal of golden nuggets is the idea of Mission Command. Instead of developing a detailed plan to be followed by all and centrally managed, we set higher-level objectives and let individuals and teams self-organize (and even improvise) and decide how to achieve the objectives.

Central mission command is very much different from central micro-management. We centrally decide on the overall (higher-level) objective to be achieved, then delegate responsibility, authority and decision-making for how to reach the goal to each team – while accountability still rests centrally.

Central command and local decision-making is not enough either. Even when teams self-organize around a central higher-level goal, their individual approaches may create trouble later down the road. So we can distribute decision-making and authority, but how do we ensure that everyone keeps the overall integrity of project and company in mind at any given time? And how do you retain central accountability?

Use decision-rules

As teams self-organize around delegated and common objectives, they don’t just need to meet their goals, they need to do their work and act in accordance with the company’s long-term interest and in concert with other teams. An orchestra only works if everyone plays in time with each other.

I recall with fondness my son’s first tuba recital in middle school band. There were 4 tubas on stage, and he finished first. Well what can I say, he’s competitive and has since grown into a splendid college athlete. He knows how to play in beautiful concert with his team on the field now but I still smile when thinking about that first recital.

Orchestration and alignment is needed both for the project goals and in the general backdrop of the company. For example, a team might decide to achieve their part of the project goal by using an open-source software package. Although that may result in achieving the project objectives, the team may have inadvertently put the company’s intellectual property at risk. Not all open-source licenses are the same. By simply using “freely” available software, you may be automatically entering into an agreement where your company must agree to make parts of their system software freely available to the rest of the world.

You can’t manage dozens of individuals by central decision-making, but on the other hand you can’t simply let everyone loose on their own either. One way to effectively sub-divide and delegate is to create a set of decision-rules that aligns everyone on how to make decisions. Think of it as a set of guide-rails that prevents you from falling off the path.

Identifying decision-rules is not the same as identifying who is responsible for making decisions and establishing an escalation path – that is not scalable either. It is all about agreeing on how teams will make local decisions, what they can decide on their own and what must be decided centrally. Such rules enable teams to make decisions consistent with the overall command objectives. The rules are set centrally, and they are how the “central command” retains accountability for the decisions made in the project.

One of the best examples of decision-rules I have heard about came out of the Boeing 777 development program. When you design airplanes you have to make tradeoffs between weight, cost and space. If you increase the weight of your subsystem it becomes a big deal, since every ounce translates into increased operating cost for the customer. Every subsystem and every engineer is faced with these kinds of tradeoffs on a daily basis. How do you manage the choices of several thousand engineers all at once? Obviously a central design review committee wouldn’t work effectively. So the project team set “budgets” for each subsystem development team for how much weight, cost and space they were allocated. If a team needed to exceed their weight allocation, they could do that as long as they found another team willing to trade some of their weight against, for example, some additional cost. That way the weight/cost/space constraints of the 777 airplane were always satisfied overall. The individual teams could trade allowances between themselves as long as they stayed within the overall budget. They didn’t have to get every tradeoff approved centrally – they used decision-rules.
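The rule itself fits in a few lines. The team names and units below are invented; the point is only that a trade is a purely local decision that can never violate the program-level constraint:

```python
# Invented teams and units; illustrates a 777-style budget-trading rule.
weight_budget = {"wing": 100.0, "landing_gear": 80.0, "avionics": 40.0}

def trade_weight(budgets, giver, taker, amount):
    """A local decision: legal as long as the giver has allowance to spare."""
    if amount <= 0 or budgets[giver] < amount:
        raise ValueError("trade not allowed by the decision-rule")
    budgets[giver] -= amount
    budgets[taker] += amount

total = sum(weight_budget.values())
trade_weight(weight_budget, "wing", "avionics", 10.0)
assert sum(weight_budget.values()) == total  # overall constraint still holds
```

No central review is needed for any individual trade, yet the overall budget is conserved by construction, which is exactly how decision-rules let “central command” keep accountability.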

So instead of creating an elaborate escalation path for decision-making, create the guide-rails within which decisions can be made at the right level in alignment with the overall mission of the company.

The Agile Manifesto and the principles behind it form another example of such a decision-making framework. For every choice or decision to be made, every engineer or team can ask themselves, for example, “does what I am about to do satisfy the principle of simplicity?”, or “if we take this approach, will we be able to show progress in terms of working software?”. Similarly, your company and project will have their own set of decision-rules that guide teams and individuals in making local decisions.

So is that it?

No that’s not all of it by a long mile. But it’s a start. If you can

  • understand that different levels of scale require different approaches, and
  • distribute both work and decision-making authority, and
  • create a good set of decision-rules that can be used to align everyone,

…then you’ve laid the foundations for a much less stressful environment at scale.

The Formula-1 Pit Stop: Lean counter-counter-intuition

What do F1 racing and Lean Product Development have in common? Not much on the surface… but if you, as I do, see everything as systems, then you can’t help but notice some interesting things. In this case I found an apparent counter-example to Lean that is Lean despite going against the grain of traditional Lean Thinking. Hmmm… we’re into double-negatives here, but stay with me.

The racing analogy is interesting to me not because “Lean=Speed” but because someone questioned the “obvious right way” and came up with a counter-intuitive better solution.

I am always amazed every time I catch even a glimpse of Formula 1 racing. The cars fly around the track at over 300 kilometers per hour, pull 1.45g during acceleration and 4g when braking. High speeds, tight turns and frequent acceleration and braking wear hard on car and driver, but even more so on the tires, which aren’t even engineered to last a whole race.

An F1 tire these days is designed to last only about 120 kilometers on average (it’s a weight vs. durability tradeoff), but most F1 races are at least 305 kilometers long. That means you need to change tires 2 or 3 times in a race that is won or lost by fractions of a second.

I’m fascinated by this, of course, because the pit stop is the biggest impediment to continuous flow around the track. If you could make one less pit stop than your competitor you would be several seconds ahead, turning 10th place into victory. For a long time that was the strategy: go easy around the curves so as to conserve tires and fuel. A lower average speed would win the race as long as you could avoid making too many pit stops. So far it sounds Lean, right? Slow down and you’ll finish the race faster. No surprise there: Lean solutions are usually counter-intuitive.

Twisting and Turning

Well, every good story has a twist and our little F1-Lean analogy is no different.

In the mid-1980s someone took a step back and looked at the whole end-to-end system and realized that the “economy” of racing could be improved. Start the race with only half a tank of fuel, and the car would be much lighter and go faster. Stop worrying about conserving tires and instead push the car to its limits on the track. The penalty for this strategy is an added number of pit stops.

Not a problem – you just need to minimize the time spent in each pit stop.

It’s a tradeoff curve, as always. If you can continuously reduce the amount of time needed to refuel and swap tires, then at some point down the curve the wear-and-tear vs. pit stop balance will shift.
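That shifting balance can be sketched with a toy model (all numbers invented for illustration): each extra stop lets the car run lighter and harder, saving on-track time with diminishing returns, while costing a fixed pit-lane penalty. As the penalty shrinks, the optimal number of stops grows.

```python
# Toy model of the pit-stop tradeoff; every number is made up.
def race_time(stops, pit_penalty, base_time=5400.0, saving_per_stop=40.0):
    """Total race time in seconds for a given number of pit stops.

    Each stop lets the car run lighter and push harder, saving time
    on track (with diminishing returns: the k-th stop saves
    saving_per_stop / k seconds), but adds a fixed pit-lane penalty.
    """
    on_track_saving = sum(saving_per_stop / k for k in range(1, stops + 1))
    return base_time - on_track_saving + stops * pit_penalty

def best_strategy(pit_penalty, max_stops=5):
    """Number of stops that minimizes total race time."""
    return min(range(max_stops + 1), key=lambda s: race_time(s, pit_penalty))

# With slow 1950s-style pit work, stopping rarely wins; as the pit
# crew gets faster, the optimum shifts toward more stops.
for penalty in (60.0, 25.0, 10.0):
    print(penalty, best_strategy(penalty))
```

The exact numbers don’t matter — the point is that the optimum is a property of the whole system, and it moves when the pit crew improves.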

Now it gets interesting. Because we now have two different models of “good” racing strategy, we have to choose – but how? We have to take a systems view, and make the decisions based on the objective merits of each strategy, not by intuition or personal preference.

Yes but we’re talking about Product Development, right?

You see analogies in Product Development all the time. Whenever there is a bottle-neck in our process we have to decide to either fix/improve the bottle-neck, or try to avoid it. Not all bottle-necks are solvable or even visible. Some are disguised as “the way we always do things here”. Most companies have settled on a particular pattern of which bottle-necks (departments/phases) are reasonable (acceptable as the cost of doing business) and which ones aren’t. Companies that structure the development flow through phases and gates accept the overhead associated with functional departments as “not perfect but the best way to do things”.

Lean Thinkers challenge this view all the time: figure out where the flow stops and then improve it. Usually it comes down to finding local optimizations and then reducing or eliminating tasks in the name of overall flow. For example, eliminating pit stops so that the race can flow uninterrupted.

And sometimes Lean Thinkers have to challenge themselves, to avoid getting stuck in the “best” Lean solution.

So where did Formula-1 end up?

Have a look at this Ferrari pit stop from 2013. Mid-race refueling is no longer allowed in F1, so now it’s all about how quickly you can get 4 new tires on the car. You can feel the anticipation as you watch the pit crew waiting for the car to arrive.

That pit stop took 2.1 seconds. It’s a huge improvement from the minute-plus pit stops of the early days of F1 in the 1950s. Pit crews spend a lot of time and money to squeeze out every millisecond they can from their process. You can almost visualize a value stream map on the garage wall and the team swarming to find and reduce the next bit of waste from the process.

Is it a fair analogy?

But, many will say, that’s not a fair analogy. Obviously they have a lot of specialized equipment and a huge crew of specialists standing by. This is high-stakes racing and has nothing to do with Lean or product development.

Well that’s the whole point behind this blog post. It’s a classic example of how Lean thinking differs from mass-production thinking. It just manifests itself in a different environment. Speed (and safety) matters in F1, and speed (and safety) matters in Lean.

Systems of systems: Lean Fractals

There is a very direct and obvious application of Lean Thinking and Value Streams to what happens in the pit. If you have seen a Value Stream Map, you can easily understand how that helps weed out waste and inefficiencies in the process flow of safely and quickly changing the tires.

But there is another, higher-level system at play, which includes both the pit stop and the track. This system-of-systems has a different kind of flow economy, working off a different set of aggregated information.

If we ignore the system-of-systems effect and blindly apply Lean tools, it can lead us astray. Pre-1980s racing solved for Lean flow based on how the costs were incurred back then, i.e. relatively slow pit stops. Slow down around the track and finish in first place. However, the cost of any activity changes over time, and the technology and capability of the pit crew improved such that the basic assumptions behind “the best way to race” had to change.

In this case they had to question some long-held assumptions in order to get to a new level of performance. Similarly, we find the biggest improvements hiding in plain sight when we look at our overall product development system and question our traditional way of working. Our world is full of systems-of-systems.

I draw two lessons from the F1-Lean analogy:

Question the Status Quo

The obvious approach isn’t always right, and you won’t see it until you look at the whole end-to-end system. It is counter-intuitive that adding one more time-consuming pit stop to your race will speed things up overall… but it does – up to a point.

Even if you settle on a “best” Lean solution, this might also be a local optimum… keep looking and keep questioning. The journey through solution space is not always a linear one.

Invest in non-mainstream efforts

In order to lean out the value stream you may have to invest and focus on non-mainstream efforts such as tooling and supporting activities. This is the stuff of overhead. It’s hard to justify increasing the overhead cost when the normal pressure is to reduce expenses. But in a Lean system the right overhead isn’t a liability – it’s what enables the mainstream to go fast. The F1 teams had to shift investment away from car and engine design and onto the less-glorious pit crew tools and processes.

The difference is in the kind of overhead: instead of traditional overhead which is needed to manage the waste generated by stitching together the work of separate departments, Lean “overhead” is there to make the value stream flow faster at higher quality.

Yes it’s a fair analogy

To return to the point above: yes, the analogy is totally fair. You can usually make your product development progress much faster if you invest in ancillary processes and tools with an eye towards end-to-end Flow. The Ferrari team in the video clip shows 21 crew members, all working frantically for less than 3 seconds. It takes 3 people per tire to do the job. Wasteful resource utilization? Not if time matters and your objective is to get the car through the pit stop quickly. Sure, we obviously can’t allocate 21-person support teams for all tasks, but the idea is to figure out what it takes to achieve the best end-to-end flow and then invest accordingly.

When I first started developing software we had bi-weekly load builds because it was such a difficult thing to get a clean build with 40 engineers all submitting several weeks of code changes, and build servers were very expensive. We avoided load build “pit stops” until they were absolutely needed.

Gradually the situation changed, and we now have efficient continuous build and test environments and the ability to submit code changes daily. The “economy” of our pit stop changed. Initially it was not easy to convince management to invest in extra build servers, tools and staff – but eventually we reached the point on the tradeoff curve where investing in ancillary things like load builds made more and more sense. Now every serious software group has a DevOps setup.

Local vs. Global Optimization

There is a balance between local and global optimizations, and the balance can
shift over time. You can’t even grasp the concept of that balance unless you look at the end-to-end system flow. Chances are that you are only looking at improvements within your own department. If you are, then you might routinely leave improvements of an order of magnitude or more on the table. Look one level up, at the system-of-systems, and see what you can find.
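The local-versus-global tension can be made concrete with a toy two-stage flow model (invented numbers, purely illustrative): the upstream stage’s cost-per-item favors big batches, while end-to-end lead time favors small ones.

```python
# Toy model: stage A batches work to amortize a setup cost, then
# hands the whole batch downstream. All numbers are invented.

def cost_per_item(batch, setup=10.0, per_item=1.0):
    """Stage A's local view: bigger batches spread the setup cost."""
    return per_item + setup / batch

def lead_time(batch, setup=10.0, per_item=1.0, downstream=1.0):
    """Global view: average time for one item to clear both stages.
    An item waits, on average, half the batch-processing time before
    the batch is handed off downstream."""
    batch_time = setup + batch * per_item
    return batch_time / 2 + downstream

batch_sizes = range(1, 51)
local_best = min(batch_sizes, key=cost_per_item)  # local optimum: the biggest batch
global_best = min(batch_sizes, key=lead_time)     # global optimum: the smallest batch
```

Optimizing the department metric and optimizing the system metric point in opposite directions — which is exactly why the balance is invisible until you look one level up.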

Lean Systems: Antifragile Applied

“Systems subjected to randomness—and unpredictability—build a mechanism beyond the robust to opportunistically reinvent themselves each generation”

– Nassim Nicholas Taleb

In a previous post I introduced the concept of Antifragility – systems that benefit from shocks, randomness and disorder. Classifying the world in the triad of Fragile – Robust – Antifragile helps us understand and manage the potential impact of the uncertainty surrounding us.

It’s initially hard to imagine that anything useful could benefit from disorder, so the first thing to realize is that although objects and things can be Fragile or Robust, they can’t be Antifragile. Systems, on the other hand (which of course includes Product Development Systems), are made up of multiple interacting components. Systems exhibit behavior as they respond to their surroundings, and can be Fragile, Robust or Antifragile. It is this ability to respond and interact that opens the door to antifragility. Antifragility can be seen as a type of evolutionary mechanism, continuously picking the best of the available options. So, when we look for examples of antifragility we need to look at systems, not objects.

Stressors: the fuel of Antifragility

A stressor is something that puts a strain on the system, pulls it away from its equilibrium. It’s the system’s response to stressors that classifies the system as either Fragile, Robust or Antifragile.

A system that gets weaker from an encounter with a stressor is Fragile. For example, a pyramid scheme collapses when exposed to the light of day. Not only the dictator (the individual) but the foundation of the dictatorship (the system) crumbles when the forces of democratic thought are applied. The best-laid project plan, with all its Gantt charts, has a best-before date sometime before the first problem is discovered.

Robust systems neither get weaker nor stronger in the presence of a stressor. Most government bureaucracies seem to fall into this category – their inability to learn and evolve astounds me, as does their unequaled staying power. Many companies operate this way too. New ideas get rejected and expelled by the corporate immune system, allowing the company structure to stay the same even in the face of certain bankruptcy. Remember Kodak? GM?

Antifragile systems on the other hand enjoy randomness and stressors, at least up to a point. Shocks and disruption make them stronger because they keep the system alert and in shape. Stressors exercise and improve the system the same way physical activity stresses and improves your body. Strength training, for instance, involves pushing your muscles just past their breaking point. Your body is able to repair this damage and even over-shoots in the repair effort. The result is that you are left with a little more muscle mass than you had before. This is how Schwarzenegger became Schwarzenegger and Ahnold was again a cool and acceptable name for your first-born. Without these stressors the system would stagnate, much like a couch-potato grows the wrong kind of body mass and ends up with clogged arteries.

Of course, there is a limit to how much stress is beneficial. Running at a reasonable effort level puts you in better shape; the first marathoner supposedly expired at the finish line, having historically over-exerted himself to deliver, with his last gasp, the one-word message to the king: “victory”.

(hang on – if they won the battle, then why the life-and-death rush? Good news would still be reasonably good the next morning, right?)

The next important thing to understand about Antifragile Systems is that they work in layers. It is not enough that individual members get stronger, the system as a whole needs to be able to survive and thrive. It needs to be able to learn and select.

It’s in the DNA of the System

Going back to our example of Mother Nature as the ultimate antifragile system, we can observe that the individual members of a species are inherently fragile. In fact, each member will eventually die, no matter how strong it is. There is a natural turnover to make room for newer and more fit members. By natural selection and replacement of individuals the system becomes more and more fit. There is a layering effect here. Individual members (at the lowest layer) compete with each other. The strong propagate their DNA and have (presumably) stronger offspring; the weaker gradually (or abruptly, as the case may be) exit the gene pool. The system as a whole (at a higher layer) grows stronger as a result. The system survives the demise of each of its members because the information that makes up the system is preserved in its DNA, surviving generation after generation of individuals.

By evolution such a system improves gradually even if there is no master plan and things happen at random. The system continuously Inspects and Adapts, and the current “best recipe” is carried forward in our DNA. As long as we recognize and seize opportunity, even a random walk will be beneficial. Antifragile systems love errors and variation for that reason.
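That inspect-and-adapt loop is easy to sketch in code: random variation plus selection, with no master plan. Everything here is hypothetical — “fitness” stands in for whatever property the system selects for.

```python
import random

def fitness(recipe):
    # Hypothetical fitness landscape: closer to 0 is better.
    return -abs(recipe)

def evolve(generations=1000, seed=42):
    rng = random.Random(seed)
    best = 100.0                            # the current "DNA": best recipe so far
    for _ in range(generations):
        mutant = best + rng.gauss(0, 1.0)   # random variation (a stressor)
        if fitness(mutant) >= fitness(best):
            best = mutant                   # selection: keep only improvements
    return best

# Starting far from the optimum, purely random variation plus
# selection ratchets the recipe toward fitness: errors are fuel.
```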

Lean Systems: Fragile?

Lean systems are called Lean because they deliberately operate with very small error margins. For example, Lean Manufacturing systems are sometimes called “zero-inventory” systems because they have almost no buffer inventory to absorb variations and problems at individual stations. If there is a problem somewhere on the production line, the whole system could shut down. This is by design: in a tightly coupled system small problems are amplified to make them painfully obvious, and every problem becomes an urgent matter.

In one sense Lean systems are therefore very fragile to disorder and error so one might be tempted to simply put Lean in the Fragile category. But it’s not that simple. The antifragility of Lean is in the DNA of the system.

Lean Systems: Antifragile

So we need to reconcile the apparent fragility of the small operating margins of a Lean system with the claim that Lean systems are antifragile.

I like Steven Spear’s (The High Velocity Edge) summary of a good Lean implementation:

  1. Build a system of “dynamic discovery” designed to reveal operational problems and weaknesses as they arise
  2. Attack and solve problems when and where they occur, converting weaknesses into strengths
  3. Disseminate knowledge gained from solving local problems throughout the company as a whole
  4. Lead by developing capabilities 1, 2 and 3

The ingenuity and beauty of Lean is that even small problems become intolerable at the system level. Lean Systems use this fragile tight coupling as a way to accelerate system-level learning. If a problem develops, it immediately becomes painfully obvious that something is wrong.

Rather than working around or ignoring these small problems, the team in charge is obligated to immediately seize the opportunity to improve the way the system works before the small problem becomes a big problem. A good lean team will swarm the problem to get it fixed, and put in place measures to ensure that similar problems don’t occur in the future. The result is that the particular process step which failed now has improved and is less likely to fail in the future.

Antifragile systems love errors, and so do Lean systems. The fragility of small error tolerances acts as a forcing function which brings problems to the surface, causing the old faulty processing step to evolve and be replaced with a new and more fit one. Each small failure alters the DNA of the Lean system just a little bit, evolving and improving. One more problem spot has been eliminated, and the probability of future defects is reduced.

So here is a perfect example of a system that is designed to evolve over time, to learn from mistakes and to grow more capable after each error. It needs no top-down direction other than living the Lean Principles. There is no master plan, yet Lean systems evolve on their own to become the most competitive and effective man-made systems we have on our planet.

Evolving. Learning. Antifragile. Lean. Wonderful.

If it’s a Pipeline, it’s leaking

Many times we view the Product Development System as a Pipeline where we pour effort and energy in, and out comes a product sometime later. You’ve probably used this analogy before, talking about “products in the pipeline” or “the R&D pipeline”.


Seems pretty intuitive, and I use that analogy too. Except I recently thought perhaps the analogy isn’t quite right. If you’re working in a Waterfall or phase-gate process, it’s not a single pipeline. It’s a series of smaller pipe lengths which are joined together by hand-offs:


The trouble with hand-offs is that they generate waste. Throughout the journey, more energy can be lost in hand-offs than actually makes it out of the pipeline. At every joint, effort and energy leak out.


I find this analogy is a little more fitting, and although it’s a simple visual it helps make the point about hand-offs at the simplest possible level. The discussion usually turns to “what are the leaks and how we stop them” and there is your entry to discuss Lean and Waste.

What do you think?