This is part 6 of 7 posts on PMI-ACP Exam Prep (link to part 5). In this post, I focus on agile practices used to prevent, identify, and resolve threats and issues, including catching problems early, tracking defects, managing risk, and engaging the team in solving problems.
All project teams need to be proficient at managing problems, threats, and issues. Since problems will always arise on a project, our effectiveness in preventing, detecting, and resolving them is likely to determine whether our project succeeds or fails. This post is broken down into four themes: understanding problems, detecting problems, managing threats and issues, and solving problems. The tools discussed in the managing threats and issues section tie back to our discussion of risk management in my second post.
When a problem occurs on a project, it can be tempting to ignore the issue and just continue pushing forward in the hope that it will somehow go away or resolve itself. But even a single problem can result in delay, waste, and rework that can bring our progress to a halt or even reverse it.
The secret to minimizing the impact of problems is to identify them as soon as possible, since early detection reduces the potential for rework. Once a problem is detected, we also need to diagnose and solve it as quickly as possible so that we don't consume any more unplanned time than necessary.
The Cost of Change
The reason it’s so costly to procrastinate in dealing with problems is because of the cost of change curve. There are a number of reasons why the cost of fixing the problem increases over time, but here are two of the key factors:
- Over time, more work may have been built on top of the error or problem, so more work will need to be undone to fix it.
- The later we are in the development cycle, the more stakeholders will be impacted by the defect, making it much more expensive to fix.
Luckily, agile methods emphasize frequent verification and validation, both through active stakeholder participation (such as modeling, demonstrations, and reviews) and through software development practices (such as pair programming, continuous integration, and test-driven development). These practices are all intended to find defects and problems as soon as possible, before the cost escalates too far up the cost of change curve.
Technical Debt
Technical debt, also known as design debt or code debt, is a concept in software development that reflects the implied cost of additional rework caused by choosing an easy solution now instead of a better approach that would take longer. Technical debt is the backlog of work caused by not doing regular cleanup, maintenance, and standardization while the product is being built. It is a backlog of things that should be done to make work easier in the future, but that aren't done because of a push to deliver features.
In software projects, the solution to technical debt is refactoring. Refactoring is the process of taking time to simplify and standardize the code to make it easier to work on it in the future. When asking the team for estimates, always ask them to include time for refactoring, since this should be part of their regular work routine. The idea that we aren’t done until the code has been refactored is baked into many agile practices. For example, in test-driven development, we first write tests that fail and then write code until the tests pass. However, we don’t stop there, we still need to do the last step, refactoring, which is why this process is referred to as “Red, Green, Refactor.”
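The "Red, Green, Refactor" cycle can be sketched in a few lines of Python. The `slugify` helper and its test are made up for this example; the point is the three stages, not the specific function.

```python
# Illustrative "Red, Green, Refactor" cycle for a hypothetical
# slugify() helper (the function name and behavior are invented
# for this example).

# Step 1 (Red): write a test first. It fails because slugify()
# doesn't exist yet.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  PMI  ACP  ") == "pmi-acp"

# Step 2 (Green): write just enough code to make the test pass.
def slugify(text):
    words = text.lower().split()
    return "-".join(words)

# Step 3 (Refactor): simplify without changing behavior; the test
# protects us while we clean up. We aren't "done" until this step.
def slugify(text: str) -> str:
    """Lowercase, trim, and hyphen-join the words of `text`."""
    return "-".join(text.lower().split())

test_slugify()  # still passes after the Refactor step
print("tests pass")
```

Because the test was written before any production code, it keeps failing (Red) until the minimal implementation lands (Green), and it guards the cleanup (Refactor).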
How do you deal with technical debt?
To keep technical debt as low as possible, refactoring should be done frequently. There is a saying: "refactoring should be like daily hygiene, not spring cleaning." This means we don't save it up for a refactoring sprint; instead, we incorporate it into all of our regular work. Creating space for the team to refactor and reduce technical debt can take commitment and courage, since the business is likely to be pushing for more features. So we need to explain that it is in their best interest for us to take the time to refactor as we go, to reduce technical debt and streamline future feature changes.
Here is a great article on technical debt and how to deal with it: https://hackernoon.com/there-are-3-main-types-of-technical-debt-heres-how-to-manage-them-4a3328a4c50c
Create a Safe and Open Environment
We want people to feel comfortable not just experimenting, but also admitting their problems, failures, and mistakes and asking for help, so that the project can recover as quickly as possible. Creating a safe and open environment is as much about avoiding catastrophic delays as it is about protecting people's feelings. The team leader can look for opportunities to reinforce the importance of asking for help. For example, if the demo shows that the team has made little progress in a particular area, maybe they are facing a problem that they haven't shared yet. This can be a coaching opportunity to remind people to share issues early.
Diagnostic tools such as cycle time, trend analysis, and control limits can point to potential problems before they occur or help us identify problems as soon as possible after they have occurred. The daily stand-up meeting is also an important mechanism for identifying problems. If you recall, the last question in the daily stand-up asks whether there are any problems or impediments. The goal is to find problems early and address them as soon as possible, rather than waiting for the next retrospective.
Lead time and Cycle Time
Lead time is a diagnostic tool that can be used to help identify and diagnose problems. The concept measures how long something takes to go through the entire process, for example, from design to shipping, or from requirements gathering through development to deployment.
Cycle time is a subset of lead time that measures how long something takes to go through part of the process, such as from product assembly to painting, or from coding to testing.
Cycle Time, WIP, and Throughput
In mathematical terms, the cycle time is a function of WIP and throughput and can be calculated by using the following formula:
Cycle time = WIP/Throughput
What this means in practice is that when keeping the same pace of work (i.e. throughput stays the same), if WIP goes down, then the cycle time goes down as well.
Software development is never a fully stable system and therefore does not strictly follow rules like that. But intuitively it makes sense – the more you multitask, the longer it takes to complete any individual task.
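The relationship can be made concrete with a tiny calculation. The story counts below are invented for illustration; the formula is the one given above.

```python
# Sketch of the relationship cycle time = WIP / throughput.
# The numbers are made up for illustration.

def cycle_time(wip, throughput):
    """Average cycle time given work in progress and throughput."""
    return wip / throughput

# A team finishing 5 stories/week with 10 stories in progress:
print(cycle_time(wip=10, throughput=5))   # 2.0 weeks per story

# Same pace of work, but WIP cut in half -> cycle time halves too:
print(cycle_time(wip=5, throughput=5))    # 1.0 week per story
```

Halving WIP while holding throughput constant halves the average cycle time, which is the formal version of "stop starting, start finishing."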
Throughput and Productivity
Throughput is the average amount of work the team can get done in a time period; in other words, their average rate of completion.
Productivity, on the other hand, is the efficiency at which the work is done, such as the amount of work done per team member.
For example, if a team's throughput goes up, it might be because their productivity went up, but not necessarily. They may have added an extra person, in which case individual productivity might actually have gone down, yet the team got more done because there were more hands on deck.
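The distinction is easy to see with numbers. The team sizes and point totals below are invented for this example.

```python
# Hypothetical numbers showing that throughput can rise while
# per-person productivity falls.

def productivity(throughput, team_size):
    """Work completed per team member in the period."""
    return throughput / team_size

# 5 people finishing 20 points per iteration:
before = productivity(throughput=20, team_size=5)   # 4.0 points/person

# A 6th person is added and throughput rises to 22 points:
after = productivity(throughput=22, team_size=6)    # ~3.67 points/person

print(before, after)  # throughput went up, productivity went down
```

So a rising throughput trend alone doesn't tell us whether the team became more efficient.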
As discussed earlier, it is advantageous to catch and fix defects as quickly as possible to minimize rework and reduce costs. This is where the idea of "defect cycle time" comes in. Although so far I have been talking about cycle time in terms of developing new work, cycle time is also useful for finding and fixing defects. Defect cycle time is the period between the time a defect is introduced and the time it is fixed.
Looking at the above chart, you can see that the team's defect cycle time is 12 hours, meaning that on average it takes them 12 hours to fix reported bugs. In the first iteration, however, the team spent more than 12 hours fixing them, but as they moved past the 5th iteration they were able to reduce the cycle time to around 8 hours.
By tracking both defect cycle time and their cycle time for creating new work, agile teams can minimize both the potential for rework and the cost of any rework that is required.
Unfortunately, no matter how hard we try to identify and prevent defects, an occasional defect may make it through all our tests and quality control processes and end up in the final product. In software development, these are called escaped defects. Defects that are missed are the most costly to fix, since they sit at the top right of the cost of change curve. We must fix these issues, but also make note of them and find ways to prevent them in the future.
A project's defect rate measures the frequency of defects found, such as "one defective feature per 50 successful features delivered." To assess and improve the effectiveness of their testing, agile teams often track their defect rates so they can monitor the trend over time.
Variance is the measure of how far apart things are, or how much they vary from each other. For example, if you ask multiple people to estimate the same job, there will be some variance (differences) between their estimates. And once the work has been executed, there will be variance between the closest estimate and the actual results. Some variance is normal and should be expected; we just want to keep it within acceptable limits. If it fluctuates beyond those limits, we want to know so we can take action.
Causes of Variation
Let's say we're given the job of driving nails into wood all day. We can expect that not all nails will be perfectly straight: some will go in straight and others will go in at a slight angle. This is common cause variation. Now let's say that someone turns off the lights while we are nailing. If we keep nailing in the dark, the variance in our nails is likely to be much larger, since the environment has changed; this is special cause variation.
Accept the Variance or Take Action?
We should simply accept common cause variation on our projects; we only need to investigate or take action in the case of special cause variation. In other words, we want to avoid micromanaging the project and instead focus on removing true bottlenecks and impediments. Asking our developers why they only coded four features this week when they completed five last week is an example of failing to accept common cause variation; this kind of output simply varies a bit and isn't perfectly predictable.
So on agile projects in particular, focusing a lot of effort on tracking conformance to a rigid plan isn't the best use of a leader's time. Instead, we should look to external indicators, such as the daily stand-up meetings where the team reports any issues or impediments to their work, to see if there are special cause issues that need to be resolved.
Trend analysis is a particularly important tool for detecting problems because it provides insights into future issues before they have occurred. Although measurements like the amount of budget consumed are still important, such measurements are lagging metrics; in other words, they provide a view of something that has already happened. While lagging metrics that provide a perfect view of the past might be exciting to accountants, leading metrics that provide a view into the future are more exciting to agile leaders.
In an agile context, control limits have a much looser interpretation that includes tolerance levels and warning signs. Such limits can help us diagnose issues before they occur or provide guidelines for us to operate within. Some agile rules of thumb, such as limiting teams to 12 or fewer members, could be interpreted as control limits.
One way we could use control limits on an agile project is to monitor our velocity to gauge how likely it is that we will be able to complete the agreed-upon work by the release date. For example, if we have 600 story points to complete in our backlog and only 10 months left until deployment begins, we should set our lower control limit to 60 points per month (600/10 = 60), since that is the minimum velocity required to meet the goal. If our velocity drops below 60 points per month, there is a chance that we won't be able to finish the agreed-upon functionality by the release date.
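The velocity check just described can be sketched in a few lines. The function names are made up for this example; the 600-point/10-month figures are the ones from the text.

```python
# Sketch of the lower-control-limit check for velocity described above.

def lower_control_limit(points_remaining, months_left):
    """Minimum velocity (points per month) needed to finish on time."""
    return points_remaining / months_left

limit = lower_control_limit(points_remaining=600, months_left=10)
print(limit)  # 60.0 points per month

def below_limit(observed_velocity, limit):
    """True when velocity has dropped below the control limit."""
    return observed_velocity < limit

print(below_limit(55, limit))  # True  -> release date is at risk
print(below_limit(65, limit))  # False -> on track
```

Crossing the limit doesn't prove the release will slip; like any control limit, it is a warning sign that tells us to investigate.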
Managing Threats and Issues
Since risk is anti-value, managing risk is critical for value-driven delivery. As a result, agile teams need to balance the goals of delivering business value and reducing risk each time they select a new batch of features or stories to work on. As the project continues, they also need to continually assess the severity of the project threats and monitor their overall project risk profile.
Exam Tip 1: For the exam, bear in mind that in agile, “risk” generally refers to threats and issues that could negatively impact the project.
Agile teams manage threats and issues using three tools: the risk-adjusted backlog, risk severity, and risk burndown graphs.
In planning each iteration, agile teams seek to balance delivering the highest-value features and mitigating the biggest risks that remain on the project. They do this by moving the items with the greatest value and risk to the top of their backlog. The backlog might start out as just a list of the business features involved in the project, divided into practical bundles of work – but once the risk response activities are added and prioritized, it can be referred to as a “risk-adjusted backlog.” This single prioritized list is what allows agile teams to focus simultaneously on both value delivery and risk reduction.
Creating the Risk-Adjusted Backlog
Most teams are comfortable ranking customer requirements, stories, features, and use cases on the basis of business value and risk level. This ranking is usually subjective and based on gut feeling. However, we can get much more scientific about building a risk-adjusted backlog by using the return on investment per feature. This process starts with the financial return expected from the project as a whole.
So for example, say you have a $2M project, and the company's cost-benefit analysis determines that the expected return for this project would be $4M in three years. Once you have a dollar figure like this, the next step is for the business representatives, not the dev team, to distribute that amount across the project features. This step is very important and can't be overlooked.
After the business representatives have attributed a dollar value to each of the product features, we can prioritize those features based on business value, as shown below:
| Feature | Business Value |
| --- | --- |
| Feature 1 (must have) | $5,000 |
| Feature 2 (must have) | $4,000 |
| Feature 3 (should have) | $3,000 |
| Feature 4 (should have) | $2,000 |
| Feature 5 (could have) | $1,000 |
| Feature 6 (could have) | $500 |
| Feature 7 (could have) | $400 |
| Feature 8 (could have) | $100 |
Next, we need to monetize the risk avoidance and risk mitigation activities. To do this, we can calculate the expected monetary value (EMV) of each risk. For example, say the project needs a reporting tool: using the in-house reporting tool would cost the project nothing, but purchasing a high-performance reporting tool would cost $10,000, and there is a 50% chance of us needing the high-performance tool. To do the calculation, we use the following formula:
Expected Monetary Value (EMV) = Risk Impact (in dollars) x Risk Probability (as a percentage)
So in the example of a high-performance reporting tool, EMV would be $10,000 x 50% = $5,000.
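The formula is simple enough to sketch directly; the figures below are the reporting-tool numbers from the text.

```python
# Expected monetary value: impact (in dollars) times probability.

def emv(impact_dollars, probability):
    """Expected monetary value of a risk."""
    return impact_dollars * probability

# High-performance reporting tool: $10,000 impact, 50% probability.
print(emv(10_000, 0.50))  # 5000.0 -> matches the $5,000 above
```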
EMV calculations can be done for most risks. It's important not to get too bogged down in exact figures; you are just trying to come up with the best numbers possible. As long as you can justify your numbers and feel they are realistic and reliable, that's good enough. Using this approach, we can rank the project risks to produce a prioritized list of threats and issues, as shown below:
| Risk | Expected Monetary Value |
| --- | --- |
| Risk 1 | $9,000 x 50% = $4,500 |
| Risk 2 | $8,000 x 50% = $4,000 |
| Risk 3 | $3,000 x 50% = $1,500 |
| Risk 4 | $6,000 x 25% = $1,500 |
| Risk 5 | $500 x 25% = $125 |
Of course, not all risks can be mitigated or avoided; some will be accepted or transferred. But for the ones that can be tackled, the next step is to prioritize the response actions along with the functional features to get the risk-adjusted backlog.
We should think of the risk-adjusted backlog as a tool that uses calculations only to get at what is truly important: the priority of the work items that need to be done. The true benefit of this tool is that it helps the product owner and the dev team bridge the communication gap and have a meaningful discussion about schedule and scope trade-offs. So although we assign numerical values to help level the playing field, creating a risk-adjusted backlog is really more of a qualitative practice than a quantitative practice.
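One way to build the single prioritized list is to merge the monetized features and risk responses and sort them by dollar value. The dollar figures reuse the tables above; the data structures and item names are illustrative, not prescriptive.

```python
# Merging monetized features and risk responses into one
# risk-adjusted backlog, sorted by dollar value (highest first).
# Values are taken from the feature and EMV tables in the text;
# the "Mitigate Risk N" labels are invented for this sketch.

features = {"Feature 1": 5000, "Feature 2": 4000, "Feature 3": 3000}
risk_responses = {"Mitigate Risk 1": 4500, "Mitigate Risk 2": 4000}

backlog = sorted(
    {**features, **risk_responses}.items(),
    key=lambda item: item[1],
    reverse=True,
)
for name, value in backlog:
    print(f"{name}: ${value:,}")
# Feature 1 ($5,000) comes first, then the Risk 1 response ($4,500),
# then the $4,000 items, and so on.
```

Interleaving the items like this is what lets the team pull the highest-value feature or the biggest risk response, whichever matters more, at the top of each iteration.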
Risks are generally assessed via two measures: Risk Probability, a measure of how likely a risk is to occur, and Risk Impact, a measure of the consequence to the project should the risk actually occur. The product of Risk Probability and Risk Impact gives the overall Risk Severity:
Risk Severity = Risk Probability x Risk Impact
This allows us to rank risks and determine risk mitigation priorities. If we give scores of Low (1), Medium (2), and High (3) to both Risk Probability and Risk Impact, high-probability, high-impact risks get a Risk Severity score of 3 x 3 = 9, whereas a high-probability but low-impact risk would be given a Risk Severity score of only 3 x 1 = 3.
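The Low/Medium/High scoring can be sketched as a small lookup table; the scale values are the 1-3 scores from the text.

```python
# Risk severity on a Low/Medium/High scale: severity = probability
# score x impact score, using the 1-3 values from the text.

SCORE = {"low": 1, "medium": 2, "high": 3}

def risk_severity(probability, impact):
    """Severity on a 1-9 scale from qualitative probability and impact."""
    return SCORE[probability] * SCORE[impact]

print(risk_severity("high", "high"))  # 9 -> top mitigation priority
print(risk_severity("high", "low"))   # 3 -> likely, but cheap to absorb
```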
It is possible to take the analysis of risks much further, using techniques such as expected monetary value that assign percentage probabilities to the Risk Probability score and dollar values to the Risk Impact value (e.g. EMV = 25% x $8,000 = $2,000). But for the purposes of illustrating risk profiles and trends, the abstract Risk Severity values of 1-9 are all we require.
For any project, we should engage the development team, sponsors, customers, and other relevant stakeholders in the process of risk identification. Their ideas, along with reviews of previous projects' lessons-learned notes, risk logs, and industry risk profiles, should be used to identify the known and likely risks for the project. Once we have this list, we can undertake our risk analysis, assign probability and impact scores to each risk, and calculate the risk severities. As with estimation, we should engage the team members in risk analysis because they are closer to the technical details, and the process of inclusion generates increased buy-in to the risk management plan and mitigation actions. No involvement, no commitment.
Risk Burndown Graph
Risk Burndown Graphs are “stacked area graphs” of risk severity. The risk severity scores for each risk are plotted one on top of another to give a cumulative severity profile of the project. When risks and their history of severities are displayed like this it is much easier to interpret the overall risk status of the project.
For instance, we can tell from the general downward trend that the project risks are reducing. Risk burndown graphs are also an excellent way of demonstrating the value of "Iteration 0" activities that may not deliver much, if any, business value but are extremely useful in proving approaches and reducing project risks.
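The data behind a risk burndown graph is just the severity score of each risk per iteration, stacked. The per-iteration severities below are invented for illustration.

```python
# Data behind a risk burndown graph: each risk's severity score
# (1-9 scale) tracked across iterations. The numbers are invented
# for this example.

severity_history = {
    "Risk 1": [9, 6, 4, 2],
    "Risk 2": [6, 6, 3, 1],
    "Risk 3": [4, 2, 2, 0],
}

# The stacked height of the graph in each iteration is the sum of
# all severities in that iteration.
totals = [sum(scores) for scores in zip(*severity_history.values())]
print(totals)  # [19, 14, 9, 3] -- a falling total means overall risk is burning down
```

Plotting each risk's row as a stacked area gives the cumulative severity profile described above; the totals alone already show whether the trend is downward.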
Agile methods don't rely on a reactive approach of "fix the problem when it arises." Instead, they identify potential issues during retrospectives and iteration reviews, which minimizes the need for ad hoc problem solving during the iterations. These events provide recurring opportunities for the team to capture lessons, probe the strengths and weaknesses of their approach, and ensure that mistakes won't be repeated. These team-based problem-solving efforts are part of the continuous improvement process.
Benefits of team engagement:
- By asking the team for solutions, we inherit consensus for the proposal.
- Engaging the team accesses a broader knowledge base.
- Team solutions are practical.
- When consulted, people work hard to generate good ideas.
- Asking for help shows confidence, not weakness.
- Seeking others’ ideas models desired behavior.
I will go through continuous improvement in the next post, which will be the final post in this series on PMI-ACP exam prep.