Who knew the race condition (or racetrack problem) could blow up your project schedule?

Like every other cool project manager these days I like to be agile. But I used to be a pretty good waterfall project manager before I found out how uncool it was.

One of the common things I used to do was called “crashing the project schedule”. In fact it is probably the most fun part of project management.

But one problem that always seemed to catch me out was the Race Condition (or Race Hazard). If you look it up in Wikipedia you will find it happens to electrical circuits, but I have found it also happens to all my project plans that involve more than half a dozen people or a planning horizon of more than a month.

The race condition says this:

When two or more people “race to get their work done” then they will unexpectedly get in each other’s way.

If they unexpectedly get in each other’s way there is a surprisingly high chance that they will both be dependent on each other (the racetrack extension of the problem). When this happens they will keep running in circles, no matter how hard they work, until an outside force intervenes.

Unfortunately “race to get their work done” simply means that they work independently of each other. That is, they do not sit together and break all their work down together before starting.

In old-fashioned waterfall projects we spent huge amounts of time planning and scheduling around dependencies. This should remove the race condition, but it does not. The reason is that the race condition comes from “unexpected dependencies”. In other words, things you didn’t think would happen.

In cool new agile projects we try to remove all dependencies, and in my teams we even break into small feature teams to isolate the dependencies within a small group of people. But we still get bitten by the race condition.

Why does it occur – why not just get rid of it?

I am not sure why the race condition occurs in electronics. But in projects it occurs because of the “interconnectedness of all things” or the “butterfly effect”. In other words we do not really understand the way every minute change in the world could impact everything else.

Increased traffic in Beijing could impact the rain in Morocco and that could impact the crops in Sudan … or it might have no effect, who knows. Even our most complex climate models are really just models – they simplify reality to help us understand it, but they do not deal with every detail of every contingency of what could happen across the world.

Similarly, even excellent project plans are really just a simplification of the likely flow of the project so we can make better decisions. One team could have a birthday cake and someone from a second team could wander over for a chat. She might notice the way the birthday team is doing their testing and take some of that back to her own team. That might cause them to find one bug that would not otherwise have been discovered and miss another one that would otherwise have been found – but then again it might have no impact.

The better our plan is, the better our decisions will be, but all decisions will still be based on an artificial model of reality and so they will not take into account all the possible relationships. And in a complex social network of people working together – that means weird things will happen and the race condition will inevitably get us.

So – what can you do?

Better planning and project crashing can help. So can breaking work up to reduce dependencies. You can also use component-based architecture to reduce (expected) dependencies. But the race condition will still get you.

Here is my plan of attack:

Keep a look out for unexpected delays or comments like “we are just waiting for team X to finish their testing so we can use the environment”. Mention these in retrospectives and look for patterns. If you notice the same one more than once then stop and talk to each other about whether there is a possible butterfly effect on each other. Consider slowing your work down to let the other guys pass if there is.
Work in short bursts so that you integrate (and thus blow up) sooner, which makes the problem easier to fix.
Add some contingency to your project (not to individual features) and consider adding some extra overhead or slack to deal with this issue. Unfortunately if you are super-efficient and really remove all slack from the project then you will look good until the race condition hits you – it is like a prime breeding ground for it to exponentially blow up everywhere.
Consider an extra person (release manager) or meeting (scrum of scrums or dependency meeting) to pick up race conditions.
Let things sort themselves out if they are not critical and urgent for you. But if they seem critical and urgent then immediately stop work because you are about to enter the race track version. Stop and have a coffee with the people you are “co-dependent with” and take the race out of things. Then if necessary do some back of the envelope planning together and treat your shared work as a one off mini-project.
You may not believe me, but if you don’t immediately stop you will waste two and a half weeks when you are “entering a race track”. People will keep doing bits of work and then have to wait for each other, even though this seems to breach the basic principle of Cause and Effect.
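
As a rough sketch of how you might spot the race track early: if you jot down each “we are just waiting for team X” comment as who is waiting on whom, a few lines of Python can tell you whether the waiting goes round in a circle. The team names and the find_wait_loop function are made up for illustration and do not come from any real tool.

```python
def find_wait_loop(waiting_on):
    """Return a list of teams that are waiting on each other in a circle, or None.

    waiting_on maps each team to the teams it says it is waiting for.
    All names here are hypothetical examples.
    """
    done = set()  # teams already fully explored without finding a loop

    def visit(team, trail):
        if team in trail:
            # We came back to a team we are already following: a race track.
            return trail[trail.index(team):] + [team]
        if team in done:
            return None
        for other in waiting_on.get(team, ()):
            loop = visit(other, trail + [team])
            if loop:
                return loop
        done.add(team)
        return None

    for team in waiting_on:
        loop = visit(team, [])
        if loop:
            return loop
    return None


# Hypothetical comments collected from stand-ups and retrospectives.
comments = {
    "web": ["api"],        # web is waiting for the api team
    "api": ["test_env"],   # api is waiting for the test environment team
    "test_env": ["web"],   # test environment is waiting for web
}
print(find_wait_loop(comments))  # ['web', 'api', 'test_env', 'web']
```

If the function returns a loop, those are the people to have a coffee with. If it returns None, the waiting is a one-way queue and may well sort itself out.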

But what is project crashing?

In case you were wondering – to crash a project you simply follow this fun process. It is really designed for task based projects but can be adapted for any project:

Get your sponsor to explain the problem you are solving and what your desired scope is (they will say it is the final scope but let’s pretend it could change)
Get the team together and have them plan the project
Look shocked and dismayed when it appears you will be 20-40 past the required end date by the time you are finished
Recover and work with the team to build the project schedule on what they think a logical approach to the project is, with no scope removal or time pressures
Take all contingency out of all tasks and estimates and add it to the end of the project as “contingency”. This is a scary thing to do but it is the right thing to do because you WILL need your contingency, but if you leave it in each task you will use it up and then need it again. I will explain one day but for the moment let’s accept that this is sad but true
Define the critical path (the longest series of dependent things that you have to do to finish).
Ignore any items not on the critical path – do not try to make them more efficient or descope them or add extra people to them. They don’t matter unless they end up on the critical path.
Crash your schedule – work out how you can reduce your critical path by slowing other things down, running two non-dependent tasks in parallel, borrowing developers from other projects, working an extra shift or spending money. Usually it ends up with spending money, running a couple of things in parallel and wasting some effort in other parts of the project.
Define your new critical path and start again. Keep going until you are happy that you are on time, even though it will cost more now.
Manage your critical path every day and let everything else slide until it looks like becoming the critical path.
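
To make the critical path and crashing steps above a bit more concrete, here is a minimal sketch in Python (3.9 or later, for the standard graphlib module). The task names, the durations in days and the critical_path function are all hypothetical; a real scheduling tool does a lot more than this.

```python
from graphlib import TopologicalSorter


def critical_path(durations, depends_on):
    """Return (total_days, path) for the longest chain of dependent tasks."""
    # Map every task to its prerequisites; tasks with none get an empty set.
    graph = {task: set(depends_on.get(task, ())) for task in durations}
    finish = {}    # earliest finish day for each task
    previous = {}  # the prerequisite that decides when each task can start
    for task in TopologicalSorter(graph).static_order():
        latest = max(graph[task], key=lambda p: finish[p], default=None)
        start = finish[latest] if latest is not None else 0
        finish[task] = start + durations[task]
        previous[task] = latest
    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = previous[node]
    return total, path[::-1]


# Hypothetical plan: durations in days, and what each task depends on.
durations = {"design": 5, "build": 10, "test_env": 3, "test": 4, "docs": 2}
depends_on = {
    "build": ["design"],
    "test_env": ["design"],
    "test": ["build", "test_env"],
    "docs": ["design"],
}
print(critical_path(durations, depends_on))
# (19, ['design', 'build', 'test'])

# Crash the build by throwing money and borrowed developers at it.
crashed = dict(durations, build=2)
print(critical_path(crashed, depends_on))
# (12, ['design', 'test_env', 'test'])
```

Notice that after crashing the build, the test environment lands on the critical path. That is why you define your new critical path and start again, and why "docs" can be ignored the whole time.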

James King