Previously in “teachers assessing agile teams”
In my previous post, I derived a lightweight model for building team and product metrics from the Kirkpatrick model, which is normally used to assess training effectiveness.
A key theme was that we should think about what, specifically, we want to learn from our metrics before adopting any typical “agile” or “product” metrics out of the box.
The result was a series of questions that you would want to use your metrics to answer. As in the Kirkpatrick model, the questions are taken from different levels or perspectives. But you do not need to answer all the questions with a fleet of metrics. Instead, you choose your focus based on the benefit and the relative effort needed to create an assessment based on only 1 or 2 perspectives/levels. That way, you only measure 1 or 2 things at a time, allowing you and the team to focus on learning from the assessment rather than capturing and reporting data.
This is the first cut of the model that I played with as I created the post:
Using this model (or a similar one) we can combine data, feedback and team meetings to build a basis for team learning. We can also use a basic communication plan to decide on how we use the same (or additional) information to keep the team’s stakeholders well informed.
The story continues
Coaches, like teachers, should be good at assessing performance, results and improvement.
Assessment in education is quite sophisticated these days. In fact, from my reading as an outsider, we can look at 4 distinct types of assessment that are in common use, all of which include feedback loops, achieve specific goals and support continuous growth.
|Type of assessment||What it involves||What it is for||Who typically uses it|
|Diagnostic||Creating a baseline of where a student or cohort is at across the board||Understanding overall learning needs and maturity||Instructional designers; Administrators assigning students to classes; Teachers building a curriculum|
|Summative (or “high stakes”)||Evaluating whether learning goals were achieved||Recognising and certifying the outcomes of learning||Assessors awarding a qualification; Teachers evaluating student achievement|
|Formative (or “low stakes”)||Providing useful feedback during the learning process||Creating useful feedback that assists ongoing learning||Students as they learn; Teachers adapting their lesson to student needs|
|Ecosystem||Assessing both whether the current curriculum is valuable and whether the overall system of learning is as helpful as it might be||Making education relevant and removing systemic impediments to learning||School management; Teachers working on improvement beyond the classroom|
How is that relevant to agile teams?
The approach that educators take to assessment is different to what many agile teams (and agile managers) currently do.
Many teams start with the assumption that either:
- We are agile, so we should empower teams to go and sprint all over the place, confident that things will end well if they are a good team;
- Agile teams need to deliver, just like everyone else, so we can use the existing metrics and governance that the organisation has in place. We can shoe-horn what we have to assess self managing teams that are continuously sensing and responding to new information; or
- Agile frameworks include the metrics, ceremonies and governance that will work well in our context. Thus we can start by implementing the basic model and then adapt it over time as we become more mature at agile stuff. The agile framework or way of working we select will contain the most suitable metrics and other assessments available to our teams.
I might agree that we can trust the people in our agile teams; that many of the existing measures and governance processes that organisations have are good; and that agile frameworks often include useful metrics and ceremonies that we can use.
However, I reject the idea that any of the three assumptions above is a sound starting point on its own. I believe that we should select fit-for-purpose assessment approaches (metrics, ceremonies, coaches assessing teams) based on what we are trying to achieve.
I think that the types of assessment used in modern education might actually be a better place to start, even though it will involve more work and iteration to create our overall assessment approach.
Let me see if I can identify the relevance of each category of assessment and see if it will lead us to a comprehensive assessment system (or toolkit) that can be used by teams, coaches and managers alike, rather than a reporting system that seems to consume energy without generating real learning or ownership of ongoing improvement.
Diagnostic measures are good and bad
In Australia, new students arriving at a school are often subjected to a battery of tests when they arrive. We also inflict a similar battery of tests on students across our schools on a semi regular basis (every couple of years) to baseline where students are at across a core set of learning areas.
This is fairly expensive in terms of student time, teacher time and administration time. Hence it is only done rarely.
Where this adds a lot of value for an agile team is in the “battery of tests”, which gives a holistic picture of the team’s development and growth needs.
We can look broadly across many aspects of the team’s processes, technical capability, interpersonal interactions, impediments and so forth. After doing this we can home in on the 1-2 key areas that will become the focus of development in the near future (the duration of the initial coaching agreement).
We can use a similar approach across multiple teams, to find common areas of need that can inform the creation of training or coaching sessions across the portfolio.
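As a rough illustration of the two steps above (narrowing one team down to 1-2 focus areas, then looking for common needs across teams), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the team names, the four areas, the 1 to 5 scoring scale and the 2.5 threshold are invented and would come from whatever diagnostic you actually run.

```python
from statistics import mean

# Hypothetical diagnostic results: each team is scored 1 (weak) to 5 (strong)
# per area. Team names, areas and scores are invented for illustration.
diagnostics = {
    "Team A": {"process": 4, "technical": 2, "collaboration": 3, "impediments": 2},
    "Team B": {"process": 3, "technical": 2, "collaboration": 4, "impediments": 1},
    "Team C": {"process": 2, "technical": 4, "collaboration": 4, "impediments": 2},
}


def focus_areas(team_scores, limit=2):
    """Return the lowest-scoring areas for one team (its 1-2 focus areas)."""
    ranked = sorted(team_scores, key=lambda area: team_scores[area])
    return ranked[:limit]


def common_needs(all_scores, threshold=2.5):
    """Return areas whose average score across teams is below the threshold."""
    areas = {area for scores in all_scores.values() for area in scores}
    averages = {
        area: mean(s[area] for s in all_scores.values() if area in s)
        for area in areas
    }
    return sorted(area for area, avg in averages.items() if avg < threshold)


if __name__ == "__main__":
    for team, scores in diagnostics.items():
        print(team, "could focus on:", focus_areas(scores))
    print("Common needs across the portfolio:", common_needs(diagnostics))
```

The numbers only point to where a conversation should start. The coach and the team still have to interpret them and agree on a goal.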
However, a broad diagnostic like this is expensive (in terms of time) and can produce an overwhelming amount of data.
This is why teachers do not continually assess progress across the whole battery of tests. Rather, they assign students to an appropriate learning journey, they create a specific curriculum for the class or they design specific improvements based on what they discover.
Whatever they design then has its own specific objectives and related assessment mechanism.
In an agile work environment this should also be the case. A diagnostic assessment could become a semi regular thing for a portfolio of teams (say once every 1-2 years). It also fits well in the toolkit that coaches choose from at the beginning of each coaching assignment.
But it is not well suited to continuous, ongoing assessment. Indeed it can be harmful:
|We come up with a dozen improvement areas for the team||Teams, like students, do not know where to start and will not have the chance to focus on small, specific and achievable goals. This is lazy coaching and leaves the job of the coach half done.|
|We compare teams, telling them who is good or bad, or secretly talking about them in our management meetings||This encourages a fixed mindset where teams, like students, learn to game the system. They learn to pass the next tests really well, while minimising the actual learning and growth involved. This also encourages bias among coaches and managers, creating a halo effect for some teams and a hard-to-shift stigma for others.|
|We ask the team to track and report on the “gaps” or opportunities identified||While this seems sensible at first glance, it results in a lot of overhead, which means that teams either adopt a “compliance with instructions” mindset in reporting or drop the tracking when they inevitably get busy.|
Another problem is that teams will often mistake the metric (gap) for a goal, when it is only a measure that informs goal setting. Putting in the extra effort to interpret the information from the diagnostic assessment and then create a specific initiative or team goal, with its own agreed assessment mechanism, creates greater focus and better motivation.
So diagnostic testing is, I believe, something that should be in every coach’s toolkit. It might also be something that teams use to “self-assess” without a coach, but it is important that teams understand that it is only a starting point. The next step is to find an area for focus and then agree on how to pursue and assess progress on the goal that they set. That will involve a combination of “summative” and “formative” assessment.
Assessing the ecosystem
Before we get to summative and formative assessment though, I want to look at assessing the ecosystem, because it is related to diagnostic testing.
Running a battery of tests on a student is usually aimed at understanding the student’s overall level of learning in order to derive their learning needs. We can then channel our students into the right courses and we can design a curriculum for a whole cohort that takes them to “the next level.”
However, teachers might also identify patterns in the learning maturity and needs of different groups of students. They might find that all students with a common set of characteristics, unrelated to the curriculum, are consistently performing well or badly. For example, students who live in dangerous neighbourhoods might be getting the same maths lessons as everyone else, but the vast majority of them might be struggling to reach the standard that we think they should be able to achieve.
We might realise that we should stop looking at (or blaming) the students for this gap and instead look for a cause that is outside the classroom and often outside the control of the student or the teacher.
We might find that, for example, constantly living in fear and being “hyper-aware” means there is no mental capacity left for learning geometry. Or we might find that the culture in an elite school seems to be toxic for some groups of students (girls, migrants, kids with specific needs).
We might even find that while students are passing their exams, none of them are getting jobs after they graduate. Our classes are now excellently designed to waste student time and teach them things that they will never use.
In education, teachers and school administrators often conduct a separate assessment of the ecosystem in which they are creating what they intend to be first class learning. This is not about assessing the student but identifying the systemic impediments and challenges that are outside the control of the student.
An excellent example of this approach can be found in the “data wise” approach to evidence-based improvement in education. While this focuses on an education-based ecosystem, I have also found that it translates well into building an evidence-based approach to identifying, leading and assessing systemic improvements in the work environment that agile coaches typically find themselves in.
In an agile work context or a product-led organisation, this means shifting from coaching the team to work well within their current context to changing the context within which teams work.
Doing so can provide great benefits. However, this approach is time intensive and assumes:
- There is an eager group of people who are committed to evidence based improvement;
- That group’s stakeholders are willing to play a long game of slow but impactful improvement beyond team based improvement; and
- The eager group and their stakeholders have the agency (permission and opportunity) to implement ongoing improvement.
So how does this impact our team metrics and ceremonies?
Most out of the box agile (and organisational) metrics are designed to assess team performance or product success. They are great in that context but only provide glimpses of useful insights about the organisation’s design, governance and business model.
There are, however, good foundations for assessing the wider ecosystem. The key is to understand systems thinking and that the system you are assessing is NOT the teams and their performance.
You will find some great starting points in lean approaches, business model driven approaches and even traditional change management and organisation design approaches.
If you are not sure where to start then you can start with “data wise” or a similar approach and slowly learn and adapt the way you use it to iteratively grow your own approach. You will be successful if you are persistent and if the 3 assumptions that I listed above are in place.
I believe that for some coaches and teams, assessing the wider ecosystem is a key part of their toolkit and that it involves creating fit for purpose metrics, not generally the same as the team (or even customer) metrics as they are seen out of the box.
This toolkit should not be confused with the tools that are used to assess and support team growth, because it is designed to change the context within which teams operate, rather than to help teams improve within the current constraints of the organisation.
Thus, coaches and managers have to choose their focus. To what extent do they want to focus on team improvement and to what extent do they want to focus on improvements outside the teams themselves? This is where approaches like OKRs can be useful in signalling what coaches and managers are planning to focus on.
But most assessment happens “in the classroom.”
I have covered diagnostic assessment and assessing the wider ecosystem, but most of the metrics a coach and agile team use will probably fall under the categories of Summative and Formative assessment.
So let’s look at how these integrate into learning in schools and, I hope, how they are relevant for our teams of “grown ups” at work.
I visited my daughter’s “agile kindergarten class” a few years ago. I was shocked by how “agile” they, and their learning, really seemed to be. But what really impressed me was how seamlessly assessment and feedback were integrated into the day, even for a class of 5- and 6-year-old kids.
My recollection of school was that we would have exams at the end of every learning adventure. But I did not see these as part of the learning process. Rather, they were like regular tornadoes or other natural disasters that I learned to mitigate.
I would muck around most of the term and then a test would suddenly appear on the horizon. I would plan to study but generally avoid doing so. Then I would pack a term of study into 1-2 days before the test, sit the test and then escape back to my normal life.
As soon as I escaped, all the knowledge I had packed into my brain for the test did the same thing – it escaped into the ether. I am not sure if I would have done as well a week later if I had to repeat the test, let alone a month or two down the track.
That is not how it is supposed to work, at school or at work. Assessments, tests and exams are not supposed to be external events that you do because you have to do them. They are an integrated part of the learning (and continuous growth) process.
Even at a young age kids are taught how to learn. They are taught how to control their stress to move from their comfort zone to their learning zone and then they are taught how to recognise and manage themselves when they enter the “fear zone” where learning is hard because their bodies are moving from thinking to fight or flight.
This builds an ownership of learning where the child (hopefully) starts to aspire to grow and learn more. However, this is not possible without clear learning objectives and ongoing assessment of progress toward those objectives.
Summative assessment and feedback
Summative assessment is the high stakes assessment that evaluates whether the student achieved mastery of the subject (or a pass mark or a fail). This might be a project, exam or activity where the performance of the student is evaluated.
Summative assessment is still powerful and it is important. At the end of grade 1, kids get assessed to see if they are ready to move to grade 2. If a young girl wants to play basketball, she will be assessed to see if she makes the team.
A team might rate themselves as great in a retro, they might find no defects in testing and they might have a throughput of work that they are proud of. But when a customer encounters the work that the team did, there is an immediate moment of truth.
Is the product or feature that just arrived desirable? Does it support a job to be done, or is it just a feature produced because the team liked it? This moment of truth is the equivalent of a “summative feedback loop” or a “high stakes assessment.” The team achieves the goal or they do not.
Teams need some way to measure themselves against an external goal or standard. Metrics like velocity and ceremonies like retrospectives are not designed for this (which I will explain under formative assessment).
So in my dodgy Kirkpatrick-like model, you might want to assess not just level 1 “how did you enjoy that” types of feedback, but meatier (and harder to measure) feedback like contribution to revenue, containment of cost, customer adoption of the product or other metrics that tell the team that they are creating value and being successful.
The limitations of summative assessment
Kids with a growth mindset learn this lesson too. When they do a presentation or they sit a test, they are rated on their performance. The work was of a high standard or it was not.
However, students also learn that this is not a reflection of whether the student worked hard or had talent (although these help). This is a reflection on the quality of the work produced this time.
This is the same for the team at work. For example, we might use customer adoption metrics. A team might build awesome products that are years ahead of their time, but if customers are not adopting the product, revenue will not grow. If revenue does not grow then the team will score an F, even if their product is awesome. This is not a reflection of whether the team worked hard or has great potential. This is an assessment of whether what the team is producing is worth buying.
Summative assessment often provides delayed feedback (the work is done, not in progress) and it focuses on rating performance rather than guiding the student (or team member) as they learn.
Formative assessment and feedback
Formative assessment is designed for the learner rather than the teacher. It is continuous feedback during the process of learning rather than at the end. For this reason it is sometimes called low stakes feedback or “assessment FOR learning.”
Since it is designed to support learning, as it is happening, formative assessment cannot be delayed. There is little point in telling a student that she has been holding her pencil wrong for the last 3 months. Instead the teacher observes the student in the moment and offers immediate, specific feedback, such as “hold the pencil further back.”
In an agile team, this may come from the interaction between the team members (and coach). It might involve a peer review of the code, or some testing. All of these things help the team improve as they work.
Sometimes formative assessment can look a little odd to old school parents who do not understand it. Students might be assessed by their fellow students instead of the teacher and the process of assessing others is teaching the student as much as the feedback they get from others. Some “exams” allow for multiple attempts. The student can actually sit the exam when they feel they are ready, receive results and then resit the sections they want to improve in. They can therefore set their own learning standard and quit when they hit it.
A similar thing happens with some of the “agile metrics” like velocity, retros and even cycle time. The team can define their own definition of done and define a story point that is not related to time or delivery. They can then assess their progress every day in the stand-up and in the ongoing discussions they have as they run outputs past each other.
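To make the cycle time example concrete, here is a minimal Python sketch of how a team might calculate it for their own use. The work items and dates are invented, and measuring cycle time as calendar days from “started” to “finished” is just one assumed definition; a team would pick the definition that matches its own definition of done.

```python
from datetime import date
from statistics import median

# Hypothetical work items: (name, date started, date finished).
# The items and dates are invented for illustration.
items = [
    ("Login page", date(2024, 3, 4), date(2024, 3, 6)),
    ("Export report", date(2024, 3, 5), date(2024, 3, 12)),
    ("Fix search bug", date(2024, 3, 11), date(2024, 3, 12)),
]


def cycle_times(work_items):
    """Elapsed calendar days from start to finish for each item."""
    return [(finished - started).days for _, started, finished in work_items]


def summarise(work_items):
    """A small summary the team can glance at in a stand-up or retro."""
    days = cycle_times(work_items)
    return {"median_days": median(days), "longest_days": max(days)}


if __name__ == "__main__":
    print(summarise(items))
```

The audience for this number is the team itself. It is a prompt for asking “what slowed that item down?” rather than a score to report upward.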
Even testing moves from being summative to formative as we use approaches like TDD, prototypes, MVPs, iterative discovery and evolutionary design.
The key thing about formative assessment is that it is not designed (primarily) for the assessor or stakeholders to understand and make a decision. It is designed for the one being assessed to understand and make a decision about what to do next.
This is where a lot of agile concepts align very well with modern education and this is where a lot of the metrics that came from the agile community have, at least broadly, their parentage. They assume that the consumer of the information is the self managed team, who are continuously inspecting and adapting and learning, with the goal of continual improvement and continual delivery of value.
So where does that leave us?
Teachers and others evaluating education have come to look at “learning” at different levels (such as in the Kirkpatrick model) and they have created a very robust assessment system that can include:
- Assessing the outcome of learning;
- Using continual feedback and ongoing assessment as part of the learning process;
- Creating a known baseline for designing and improving large scale construction of a complete portfolio of learning programs; and
- Evaluating and questioning the system within which learning is taking place.
I believe this same approach can form the basis of building an assessment approach for our teams (product teams, agile teams, BAU support teams) and our portfolios of work. I think the existing metrics that are available to agile teams are well suited to this approach, but that they may not create a holistic approach if we simply start using them out of the box, rather than seeing them as part of the system (or Way of Working) that we work in.
I would like to try an example or a straw man to look at how that might be done, but I feel as though this article is already too long.
My only choices now are to edit this article effectively or to delay the construction of a template or model of assessment until I get another burst of writing energy.
I choose to leave things as they are and come back another time. But let me know if you can see value in this approach (based on what you see).