Scaling Agile – it’s not the same problem

I was prompted by Al Shalloway’s (@alshalloway) brief tweet this week combined with a long train journey to write this blog post.

That is exactly one of the key ideas in Nassim Nicholas Taleb’s (@nntaleb) Antifragile. The kinds of problems you get at scale are not the same kinds of problems you deal with in small numbers. To quote Nassim Taleb:

“A city is not a large village, a corporation is not a larger small business”.

Likewise, a single Agile team does not behave the same way as a project team made up of several Agile teams. The method and style of communication change, as do the kinds of risks and issues we deal with. It’s a transformation into something new, not a simple multiplication.

Why scale breeds complexity

If interactions grew linearly with the system size, then scale wouldn’t be such a problem. In that case, each time you add a new member to the system, the complexity (C) increases by one, since you only add one more possible relationship, as visualized below:

[Figure: linear growth]

Unfortunately, as real systems grow in scale, their complexity grows far faster than linearly because each component influences or interacts (directly or indirectly) with many others. With n members there are n(n-1)/2 possible one-to-one relationships, so the tenth member adds nine new ones rather than one, and if you also count interactions among subgroups the number of possibilities roughly doubles with every member added. It’s an old idea that is surprisingly often neglected by those who design systems. You can visualize how the number of possible interactions grows as we add one more circle to each diagram below:

[Figure: geometric growth]
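The difference between the two diagrams can also be put in numbers. What follows is only a rough Python sketch, not something from the original argument: it assumes that "linear" means each newcomer connects to just one existing member, that the connected diagram counts every possible one-to-one relationship, and that "indirect" interaction can be approximated by counting every possible subgroup of two or more people. The function names are made up for illustration.

```python
from math import comb


def linear_links(n: int) -> int:
    """Links when each new member connects to exactly one existing member."""
    return max(n - 1, 0)


def pairwise_links(n: int) -> int:
    """Possible one-to-one relationships among n members: n * (n - 1) / 2."""
    return comb(n, 2)


def possible_subgroups(n: int) -> int:
    """Subgroups of two or more members: 2**n minus the empty set and n singletons."""
    return 2 ** n - n - 1


# Compare a few group sizes, including a full Scrum team (9) and a dozen people.
for n in (3, 5, 9, 12):
    print(n, linear_links(n), pairwise_links(n), possible_subgroups(n))
```

For a nine-person team this gives 8 links in the linear case, 36 possible pairs and 502 possible subgroups, which is one way to see why behavior changes long before a group gets large.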

Complexity is a property of the behavior of the system, not the structure of it. That is, it’s not the number of components that makes it complex, but how they interact – as visualized by the number of lines in the diagrams above.

Conversely, we may have systems that have a complicated structure but yield simple behavior. For example, a mechanical pocket watch has a complicated structure but very simple and (thankfully) predictable behavior.

So it’s complexity resulting from interactions and behaviors we have to be concerned about. This is why we see different types of problems at different levels of scale.

Aside: scale up and down

In order to be considered “scalable” the transformation must be two-way. I have seen software systems, for example, that are designed to “scale” but can’t run on modest hardware. That is not scaling, that is just big and bulky. Somehow we forget that scaling is not the same as simple expansion. To me a good scaling implementation is a dynamic one, that can scale both up and down. If you can only go one way, it’s not “scaling”.

So that’s the problem: the behavior of a small group is different than that of a large group, which is again different from that of a group of groups… and the effect shows up sooner than you might think because of the number of new communication paths.

The oft-cited Dunbar’s number says that people can have some sort of social relationship (but not necessarily a deep relationship) with up to around 150 people, but we see behavior change way sooner than that. The upper bound of a Scrum team (9 people) is based on experience and seems to be re-affirmed time and again. Beyond that, people see themselves as part of a mass, not a team. In my organizations I never had a manager with more than 10-12 direct reports.

Sometimes trouble shows up at even smaller numbers. It is said that behind every successful man stands a woman; behind every failed man there are two. I think that joke works with Scrum teams and Product Owners also, except it’s not funny.

Scale the problem down

This is where Agile and Antifragile meet again: big problems are best solved when you can scale them down and distribute the difficulty. The secret to successfully executing big projects is not to scale the project team up, it’s to scale the problem domain down.

What I mean by that is that we shouldn’t simply look at the project requirements and then figure out how to scale a team to build the whole thing all at once. Approaches such as Minimum Viable Product (MVP) and applying Design Thinking to focus the problem space can save a lot of time, effort and money by applying resources only to what is critical and important.

What about decision-making? The problem of scaling decision-making is a tough one. Many times we don’t even consider that we can change the way decisions are made, and so we fall back on a central person or core team that is responsible for making all the individual decisions. It’s a fragile setup because these teams are disconnected from what happens on the ground.

Scaling decision-making

To me every scaling problem is a delegation and distribution problem. The most ineffective way of all is when someone decides to distribute workload but is unwilling to delegate authority. Micro-managers would object to the micro-manager label except they are too worn out from trying to apply their bottleneck everywhere at once… and they don’t read my blog anyway. When you’re too close to the tree you can’t see the forest.

So what can we do? First of all, recognize that you can’t possibly keep more than half a dozen balls in the air at one time. Divide and Conquer, distribute, subdivide… do what it takes to bring things into manageable chunks. This is after all what Agile methods do: break things down into manageable smaller pieces of work.

Distribute Distribute Distribute

Distributed systems, whether they are mechanical, software, political or social, have some compelling properties. Their parts are individually self-sufficient, resilient and effective. Centrally organized systems, on the other hand, are attractive only on paper or at very small scale. They work fine as long as the “central” part is available and capable of managing the information flow, but they quickly break down and become a bottleneck in any but the simplest projects.

So work obviously needs to be decomposed and distributed amongst multiple teams. Nothing new here: whether you distribute according to traditional functional teams or cross-functional Agile teams, it has to be done.

It is not enough to just distribute the work; you also have to distribute authority and decision-making when you’re dealing with more than a couple of teams.

Central Mission command, local tactical decisions

One of the most powerful and elegant ideas in Don Reinertsen’s (@DReinertsen) well-equipped arsenal of golden nuggets is the idea of Mission Command. Instead of developing a detailed plan to be followed by all and centrally managed, we set higher-level objectives and let individuals and teams self-organize (and even improvise) and decide how to achieve the objectives.

Central mission command is very much different from central micro-management. We centrally decide on the overall (higher-level) objective to be achieved, then delegate responsibility, authority and decision-making for how to reach the goal to each team – while accountability still rests centrally.

Central command and local decision-making is not enough either. Even when teams self-organize around a central higher-level goal, their individual approaches may create trouble later down the road. So we can distribute decision-making and authority, but how do we ensure that everyone keeps the overall integrity of project and company in mind at any given time? And how do you retain central accountability?

Use decision-rules

As teams self-organize around delegated and common objectives, they don’t just need to meet their goals, they need to do their work and act in accordance with the company’s long-term interest and in concert with other teams. An orchestra only works if everyone plays in time with each other.

I recall with fondness my son’s first tuba recital in middle school band. There were 4 tubas on stage, and he finished first. Well what can I say, he’s competitive and has since grown into a splendid college athlete. He knows how to play in beautiful concert with his team on the field now but I still smile when thinking about that first recital.

Orchestration and alignment are needed both for the project goals and against the general backdrop of the company. For example, a team might decide to achieve their part of the project goal by using an open-source software package. Although that may achieve the project objectives, the team may have inadvertently put the company’s intellectual property at risk. Not all open-source licenses are the same. By simply using “freely” available software, you may be automatically entering into an agreement under which your company must make parts of its own system software freely available to the rest of the world.

You can’t manage dozens of individuals by central decision-making, but on the other hand you can’t simply let everyone loose on their own either. One way to effectively sub-divide and delegate is to create a set of decision-rules that aligns everyone on how to make decisions. Think of it as a set of guide-rails that prevents you from falling off the path.

Identifying decision-rules is not the same as identifying who is responsible for making decisions and establishing an escalation path. That is not scalable either. It is all about agreeing on how teams will make local decisions on their own, what they can decide on their own and what must be decided centrally. Such rules enable teams to make decisions consistent with the overall command objectives. These rules are set centrally, and they are how the “central command” retains accountability for the decisions made in the project.

One of the best examples of decision-rules I have heard about came out of the Boeing-777 development program. When you design airplanes you have to make tradeoffs between weight, cost and space. If you increase the weight of your subsystem then it becomes a big deal since every ounce translates into increased operations cost for the customer. Every subsystem and every engineer is faced with these kinds of tradeoffs on a daily basis. How do you manage the choices of several thousand engineers all at once? Obviously a central design review committee wouldn’t work effectively. So the project team set “budgets” for each subsystem development team as to how much weight, cost and space they were allocated. If a team needed to exceed their weight allocation, they could do that as long as they found another team that would be willing to trade some of their weight against, for example, some additional cost. That way the weight/cost/space constraints of the 777 airplane were always satisfied overall. The individual teams could trade allowances between themselves as long as they stayed within the overall budget. They didn’t have to get every tradeoff approved centrally – they used decision-rules.
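To make the mechanics of such a rule concrete, here is a small Python sketch. It is purely illustrative and makes no claim about how the 777 program actually tracked its budgets: the team names, the numbers, the `Allowance` class and the `trade` function are all invented, and space is left out to keep it short. The rule it encodes is the one described above: two teams may swap allowances between themselves without central approval, as long as neither ends up with a negative allowance, which keeps the overall totals unchanged.

```python
from dataclasses import dataclass


@dataclass
class Allowance:
    """Budget a team still has available (hypothetical units)."""
    weight: float
    cost: float


def trade(budgets, buyer, seller, weight, cost):
    """Move `weight` from seller to buyer in exchange for `cost` from buyer to seller.

    Returns True if the trade fits the decision-rule and was applied,
    False if it falls outside the guide-rails and needs to go elsewhere.
    """
    if weight < 0 or cost < 0:
        raise ValueError("trade amounts must be non-negative")
    b, s = budgets[buyer], budgets[seller]
    # The rule: nobody may give away more than they have been allocated.
    if s.weight < weight or b.cost < cost:
        return False
    s.weight -= weight
    b.weight += weight
    b.cost -= cost
    s.cost += cost
    return True


def totals(budgets):
    """Overall (weight, cost) across all teams; trades must not change this."""
    return (sum(a.weight for a in budgets.values()),
            sum(a.cost for a in budgets.values()))


# Invented example: avionics needs more weight and pays for it with cost budget.
budgets = {
    "avionics": Allowance(weight=100.0, cost=50.0),
    "cabin": Allowance(weight=300.0, cost=20.0),
}
before = totals(budgets)
accepted = trade(budgets, buyer="avionics", seller="cabin", weight=25.0, cost=10.0)
print(accepted, totals(budgets) == before)
```

The point of the sketch is that the check lives inside the rule itself. No reviewer appears anywhere in the code, yet the overall budget cannot be exceeded by any sequence of accepted trades.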

So instead of creating an elaborate escalation path for decision-making, create the guide-rails within which decisions can be made at the right level in alignment with the overall mission of the company.

The Agile Manifesto and the principles behind it form another example of such a decision-making framework. For every choice or decision to be made, every engineer or team can ask themselves, for example, “does what I am about to do satisfy the principle of simplicity?”, or “if we take this approach will we be able to show progress in terms of working software?”. Similarly, your company and project will have their own set of decision-rules that guide teams and individuals in making local decisions.

So is that it?

No, that’s not all of it, not by a long mile. But it’s a start. If you can

  • understand that different levels of scale require a different approach, and
  • distribute both work and decision-making authority, and
  • create a good set of decision-rules that can be used to align everyone,

…then you’ve laid the foundations for a much less stressful environment at scale.
