Using data is good but watch out

A lot of product teams use a lot of data and I think that is great. I am even in the process of helping some teams better use the data that is available to them.

However, doing so has reminded me that “using data” and “benefiting from using data” are different things.

This article is about how using data might not be useful. Based on my experience and the opinions of others (and absolutely no actual data), I suggest some steps you can take to protect yourself from being the victim of misleading data.

Any data appearing here is of a dubious nature

I have said that the steps I recommend are not based on “actual data” but that statement is “actually” subject to interpretation.

Often when I see data it is based on unclear goals or questions and therefore may not actually be relevant. Similarly, I sometimes see data that is based on something subjective but has been turned into a number, so it is like a well-laundered opinion.

The evidence is dirty and unreliable but it has been recreated to appear objective.

So a more accurate statement about data in this article is that it is of a dubious and unreliable nature. Additionally, where it might be accurate, it is not directly going to support the steps or recommendations I provide – there might be data but it does not support the conclusions that I share.

Does that happen outside the world of the blog article?

Just as I might rely on dubious data to support my recommendations here, I see teams in the real world trick themselves into doing the same thing.

The first cause of this, responsible for approximately 38.2% of all suffering in agile teams, is the desire to turn something subjective into something objective.

For example, a team uses story points to measure velocity. It helps them guess how much work they can commit to in the next sprint.

Then a well-meaning manager wants to look at team performance and sees the same number. She decides that if there are 5 people in the team and the velocity is 30, then each team member probably produces 6 points. Thus if the team assigns someone’s name as the owner of 9 points, that person is performing above expectations, and if someone else only has 3 points then they are not doing so well. There is no malice involved but the number being used does not actually represent the thing being judged.

Worse though, the existence of a clear and dubious number causes changes in the team’s behaviour. People start to compete to score points rather than collaborate.

Fortunately the manager sees this change in behaviour and takes action to stop it, relying on their instinct and the existing trust in the team. Phew.

But the team is not yet safe – somewhere in the organisation, there is someone looking to compare the delivery of teams across their portfolio. It is so hard to know which teams are going fast and which teams are going slow. In the dark ages people had function points but they relied on the mystical work of the function points priests and sorcerers. But wait – all teams have story points and they are a measure of velocity. So teams with a high velocity are going fast, or so the portfolio analyst thinks to himself.

Again – the subjective points used to guess how much work to throw into the next sprint are, indeed, a number, but not an objective number representing speed or velocity. A point is not a kilometer of work, nor even a centimeter of work; it is a guess about how much a team can commit to in a week. 100% of attempts to use an arbitrary and basically random number to measure speed result in wasted effort and 42% of those attempts result in poor decisions. (Fortunately 58% of the time the people relying on comparative velocity are merely using it for gossip and not for actionable decision making, so this practice is relatively harmless.)

You get my point though, the use of points to make decisions about performance or cross-team comparisons is pointless.
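To make the cross-team comparison concrete, here is a minimal sketch with invented numbers. It assumes two hypothetical teams that finish the same four pieces of work in a sprint but size them on their own private scales. Nothing here comes from real team data.

```python
# Hypothetical estimates: the same four pieces of work, sized by two teams.
team_a_estimates = [1, 2, 3, 2]    # team A tends to use small numbers
team_b_estimates = [5, 8, 13, 8]   # team B sizes the same work with bigger numbers


def velocity(estimates):
    """Velocity is simply the sum of points for the completed stories."""
    return sum(estimates)


velocity_a = velocity(team_a_estimates)
velocity_b = velocity(team_b_estimates)
stories_a = len(team_a_estimates)
stories_b = len(team_b_estimates)

print(f"Team A: {stories_a} stories done, velocity {velocity_a}")
print(f"Team B: {stories_b} stories done, velocity {velocity_b}")
```

Both teams delivered the same amount of work, yet team B appears to be going more than four times as fast. The difference lives entirely in how each team guesses, which is why the number is only useful inside the team that produced it.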

Resulting advice

Based on this subjective anecdote, I think there are some obvious steps that we can take:

  • Realise that not all data are relevant, or at least not relevant to the question you are addressing.
  • Ask yourself what you want to decide or understand before selecting relevant data to use, rather than seeing data and trying to make it work.
  • Separate data used for curiosity (playing with the data to see if something reveals itself) from data used to answer specific questions (having a goal or hypothesis). Attempt to have “the mind of the child” when playing with data and avoid prejudgment. Similarly, attempt to ensure the relevance of data when using it to test a hypothesis or report on something.

If you want to learn more about these dubious areas, google “selection bias”, “hindsight bias” and “confirmation bias”, which undermine the use of data and result in 45% of irrational conclusions that look entirely rational.

When things do not add up

One of the best things about data is that it can reveal things that we have not noticed. Data can overcome our own subjective biases. But the opposite is also true. Data can be wrong where subjective experience is right.

If you want to have good data insights about customers, try running your conclusions by your customer support people. Sometimes you will have evidence that they have not seen, but often they also have direct observations and experience that your data do not account for. It is easy to assume that good, logical data and algorithms outperform experience and hunches, but it is not always the case – sometimes hunches and lived experience see what is invisible or misleading in the data.

That is an easy one, but it can result in people telling you that you are wrong, so 59% of product teams avoid validating their views with those who deal with customers every day. Similarly, many managers and HR teams base policies on recommendations drawn from data at other companies (say Google), but don’t run the ideas past their own staff. While HR and managers do not collect data on this (79% of HR people are bad at using statistical inference and this is totally not a stereotype or bias, it is a number), I think this is still prevalent (no supporting data).

Laundering subjective data

Laundering money is where money you don’t want to explain is converted into money you can explain, through dodgy practices and criminal activity.

Most of our teams would not participate in laundering money, but do we launder data? Let me look at one experience I had.

I asked people in a retrospective whether things were going well. Most said yes, or gave a non-committal answer. So I decided to get a more quantitative and objective view. I asked people to score things out of 5. We got a real number and, even better, it was a positive rational number. Let’s say that the average rating was 3.9.

I then thought about comparing that number over time – if it went up then things are improving and if it went down then things are going the wrong way. The graph is easy to do and I can easily report it.

Now I have a rating that might look objective. I can clearly show that an improvement from 3.9 to 4.1 is a growth of 0.2 happy factor points. However it is still from the same source as the “I guess things are good” rating. The number looks objective but it is really a different representation of the same thing. It is still subject to peer group pressure, differing definitions of good, different internal attitudes to how to rate things and so forth.
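As a rough illustration of how much an average can hide, here is a small sketch. The individual ratings are invented; they are chosen only so that the averages match the 3.9 and 4.1 in the story, and they assume a hypothetical team of ten people.

```python
from statistics import mean

# Hypothetical ratings out of 5 from a team of ten people.
sprint_1 = [4, 4, 4, 4, 4, 4, 4, 4, 4, 3]
sprint_2 = [5, 5, 5, 5, 5, 5, 5, 4, 1, 1]

avg_1 = mean(sprint_1)
avg_2 = mean(sprint_2)

# Count the people who gave a clearly negative rating (2 or lower).
unhappy_1 = sum(1 for rating in sprint_1 if rating <= 2)
unhappy_2 = sum(1 for rating in sprint_2 if rating <= 2)

print(f"Sprint 1: average {avg_1:.1f}, unhappy people: {unhappy_1}")
print(f"Sprint 2: average {avg_2:.1f}, unhappy people: {unhappy_2}")
```

The graph would show a growth of 0.2 happy factor points, yet in this made-up second sprint two people have gone from “fine” to “things are bad”. The average is still just opinion, and it is opinion with the interesting detail removed.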

I am not against the agile fist of five or the converting of opinion into numbers for discussions or surveys. My suggestion though, is to remember that it is still subjective.

So the step here is to ask what something actually means and where it came from. When you see an NPS, Velocity, Engagement Score, be aware of the source and collection of the number. It might be useful but it should not gain credibility just because it has been “laundered” from opinion to a number or graph.

The lazy bureaucrat

Measures are good because they support decision making. “What gets measured gets managed” as they say.

Measures are also good because they change behaviours. A team that sees something will react to it, so showing where their effort is being spent or the number of bugs that are being created can help them to adjust their behaviour to improve the future.

But there is another saying – what gets measured gets gamed.

I once worked with a customer support administration area that was tracking the time taken to resolve customer issues. An executive set a goal of 5 days when the current average was close to 20. The numbers improved quickly and people were happy. But I was then investigating some complaints from a customer that came up as part of a warranty support role I played. I found they had been waiting weeks for answers to previous problems but that we were reporting that their requests were resolved within days.

When I checked, I found that a request could be put on hold if it was “awaiting information” which was there to account for customers not responding to requests for information. I was shocked to discover though, that in multiple work teams, a request was automatically put into the status of waiting for information if it was about to breach a service level agreement (SLA).
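Here is a minimal sketch of how that kind of gaming can look in the numbers. It assumes the reporting clock simply stopped counting while a request sat in “awaiting information”, which is my guess at the mechanism rather than a description of the real system, and the request history is invented.

```python
# Hypothetical history for one request: (status, days spent in that status).
history = [
    ("open", 4),
    ("awaiting information", 17),  # parked just before the 5-day target was breached
    ("open", 1),
]

# Assumption: time in these statuses is excluded from the reported figure.
PAUSED_STATUSES = {"awaiting information"}


def reported_days(history):
    """Days counted by a clock that pauses in the statuses above."""
    return sum(days for status, days in history if status not in PAUSED_STATUSES)


def elapsed_days(history):
    """Days the customer actually waited, whatever the status."""
    return sum(days for _, days in history)


reported = reported_days(history)
elapsed = elapsed_days(history)

print(f"Reported resolution time: {reported} days")
print(f"Customer actually waited: {elapsed} days")
```

In this made-up case the report says 5 days and the customer says 22. Looking at both numbers side by side is one way to notice when the two have drifted apart.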

This offended me and I escalated to the head of the department. I have softened my view since then though – I think the practice is still outrageous, but I think the error is more systemic than moral now. I think many good people have committed “minor” sins to keep the boss happy without really pausing to think about the impact.

This has to do with two (or more) related things. The first is not seeing the impact of the incorrect number and the second is the avoidance of short-term stress when there is no clear resolution.

So the step here is to anticipate that every measure risks becoming a goal and that when it becomes a goal it is likely that people will forget the original goal and find the quickest, easiest and least painful way to achieve the (now gamed) measure.

So when you go to use any number or data for anything beyond a single use, consider the “lazy bureaucrat” who will find the easiest way to achieve the score, potentially without achieving the goal. Ask yourself “If someone wanted to achieve these numbers with the least effort possible, what would they do?”.

I got that test from the book “Upstream” by Dan Heath and it clarified something I had been observing for many years. Dan Heath also asks, in the same book, several other questions that influenced this article.

Another is “what are the unintended consequences of this?”, or more specifically “what if we achieved these short term measures but actually created a poor outcome – what could explain that happening?”

So the step here is to first ask these questions and then to also consider a “countermeasure” or secondary measure that keeps you alert to unintended consequences of behaviours changed by your measures.

A rising tide lifts all ships

The final step in this growing article is to look at what else might explain the data we see.

There is a saying that a rising tide lifts all ships. In other words the captain of the ship is doing nothing, yet the ship is rising.

No great shock there, I guess, but the saying has broader implications.

  • A company’s share price might go up and the executives celebrate their leadership and the success of their latest efforts. But the whole market might have gone up at the same time.
  • A home-owner might spend $100,000 on renovations and then sell their house for $500,000 more than they bought it, but the market went up while they were renovating. How much of the increase in price was market related and how much was the renovation?
  • Staff turnover might go down because of the time of year, or the outside market, while agile coaches think it is because of the learning culture they are building.
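The home-owner example above can be roughed out with some arithmetic. The $100,000 and $500,000 come from the example; the purchase price and the market growth are pure assumptions I have made up for illustration, and the sketch also assumes the house would have simply tracked the market without the renovation.

```python
# Figures from the example above.
renovation_cost = 100_000
price_increase = 500_000   # sale price minus purchase price

# Hypothetical assumptions, not from the example.
purchase_price = 1_000_000
market_growth_percent = 30  # how much similar houses rose over the same period

# Portion of the increase the market alone would explain under these assumptions.
market_related = purchase_price * market_growth_percent // 100

# Whatever is left over: renovation, luck, a keen buyer, or anything else.
not_market_related = price_increase - market_related

# What remains once the cost of the renovation is taken out.
left_after_cost = not_market_related - renovation_cost

print(f"Explained by the market: ${market_related:,}")
print(f"Not explained by market: ${not_market_related:,}")
print(f"Left after renovation cost: ${left_after_cost:,}")
```

Change the assumed market growth to 45% and the renovation appears to have lost money; change it to 10% and it looks like a triumph. The answer depends almost entirely on a number the home-owner did not control and may not know.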

OK – so this is the step that comes from the saying about rising tides. In complex systems most things have multiple causes and most causes have multiple effects. I cannot give a number on what “most” means here because 85% of attempts to use statistics and measures in complex adaptive systems fail to account for the complex and adaptive parts of the system and are thus simplistic numbers to make collectors feel good rather than predictors of outcomes.

Anyway – the step is this. When you look at a result or number, ask yourself:

  • What else could explain this?
  • What might someone else attribute this change to?
