Uncovering Root Causes with the 5 Whys in IT Problem Management

In IT environments, problems and incidents are inevitable. However, the difference between a smoothly operating system and one plagued with recurring issues lies in how efficiently these problems are diagnosed and resolved. IT Problem Management plays a crucial role in identifying and eliminating the root causes of incidents to prevent them from happening again. One of the simplest yet most effective root cause analysis techniques used in this process is the 5 Whys method.

Let’s dive into how the 5 Whys can be applied in IT Problem Management to tackle complex issues at their core.

 

What is IT Problem Management?

IT Problem Management focuses on identifying, diagnosing, and resolving the underlying causes of incidents that disrupt IT services. Its goal is to minimize the recurrence of incidents and improve overall system reliability by addressing the root causes of issues, rather than just treating the symptoms.

There are two types of problem management:

  • Reactive Problem Management: Solves problems after incidents occur.
  • Proactive Problem Management: Identifies potential issues before they cause incidents, preventing problems from arising in the first place.

A critical part of problem management is Root Cause Analysis (RCA), which seeks to understand the "why" behind incidents. This is where the 5 Whys technique comes into play.

 

What is the 5 Whys Technique?

The 5 Whys is a simple, iterative problem-solving tool that helps teams drill down to the root cause of an issue by asking "Why?" multiple times. Each answer leads to the next "Why?" until the fundamental cause of the problem is uncovered.

The method was originally developed by Sakichi Toyoda, founder of Toyota Industries, and has since been widely adopted in various industries, including IT. While five iterations of "Why?" are common, the number of whys can vary depending on the complexity of the issue.

 

How the 5 Whys Works in IT Problem Management

The 5 Whys is particularly useful in IT Problem Management for surfacing root causes that might otherwise go unnoticed behind the more visible symptoms of an incident. Here’s how to use the technique:

  1. Identify the Problem: Clearly define the incident or problem that needs to be solved. For example, “Our website keeps crashing during peak traffic hours.”
  2. Ask the First "Why?": Why is the website crashing? For instance, the initial answer might be, "Because the server is overwhelmed by too many simultaneous requests."
  3. Ask the Second "Why?": Why is the server overwhelmed by too many requests? The answer could be, "Because the server capacity is not sufficient to handle peak traffic."
  4. Ask the Third "Why?": Why is the server capacity insufficient? The next answer might be, "Because we haven’t scaled the infrastructure to match increased traffic demands."
  5. Ask the Fourth "Why?": Why haven’t we scaled the infrastructure? The answer could be, "Because the monitoring system didn't alert us to the increasing load."
  6. Ask the Fifth "Why?": Why didn’t the monitoring system alert us? The final answer might be, "Because the alert thresholds were set too high and didn’t account for gradual traffic increases."

At this point, the root cause of the problem is clear: the monitoring system’s alert thresholds need to be reconfigured to detect gradual increases in traffic before they overwhelm the server.
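
To make the idea of "detecting gradual increases" more concrete, here is a minimal sketch of a check that looks at the trend in load as well as a fixed limit. The function names, the 90% static limit, and the 25% growth limit are all assumptions chosen for illustration; they are not taken from any particular monitoring product.

```python
from statistics import mean


def growth_rate(samples):
    """Relative change between the average of the earlier and later half of the samples."""
    if len(samples) < 2:
        return 0.0
    mid = len(samples) // 2
    earlier = mean(samples[:mid])
    recent = mean(samples[mid:])
    if earlier == 0:
        return 0.0
    return (recent - earlier) / earlier


def check_load(samples, static_limit=90.0, growth_limit=0.25):
    """Return the alerts raised for a series of load samples (oldest first).

    static_limit and growth_limit are hypothetical defaults for this sketch.
    """
    alerts = []
    # Classic threshold: fires only once the latest sample is already high.
    if samples and samples[-1] >= static_limit:
        alerts.append("static")
    # Trend check: fires when load has grown noticeably, even if still below the limit.
    if growth_rate(samples) >= growth_limit:
        alerts.append("trend")
    return alerts


# Load is well below the static limit but climbing steadily.
print(check_load([40, 42, 45, 50, 56, 63]))
```

With a static threshold alone, the series above would raise nothing until the server was already close to saturation. The trend check is one possible way to give the team earlier warning; a real monitoring setup would likely use longer windows and tuned limits.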

Note: The 5 Whys technique is often used in combination with other techniques, in particular Ishikawa diagrams and brainstorming. Whilst effective, it is only one technique, and more complex issues usually call for using it alongside others.

 

Benefits of Using the 5 Whys in IT Problem Management

  • Simple and Effective: The 5 Whys technique doesn’t require advanced tools or complicated processes. It can be done quickly in a meeting or even informally as part of a team discussion.
  • Gets to the Root Cause: By continuing to ask "Why?", the method helps peel back layers of symptoms to reach the true root cause, ensuring that the real issue is addressed.
  • Prevents Recurrence: Addressing the root cause identified by the 5 Whys helps stop the problem from recurring, saving time and reducing future incidents.
  • Encourages Collaboration: The 5 Whys approach often involves input from multiple stakeholders, encouraging collaboration across IT teams, developers, and operations to resolve issues effectively.

 

How to Apply the 5 Whys in IT Problem Management: A Step-by-Step Process

  1. Gather the Right Team: Involve people from different departments (IT, development, support) who are familiar with the incident to provide insights and perspectives.
  2. Define the Problem: Clearly state the issue. It’s essential to frame it correctly so that the process targets the right problem rather than a side effect.
  3. Ask "Why?": Start by asking why the problem occurred. Write down the answer and then ask "Why?" again, based on the response.
  4. Repeat the Process: Continue asking "Why?" up to five times or until the root cause is identified. Each iteration should move closer to the real cause of the problem.
  5. Take Corrective Action: Once the root cause is identified, take steps to eliminate it. Implement solutions that address the real problem, not just the symptoms.
  6. Document and Share Findings: It’s important to document the 5 Whys process and the findings for future reference. Sharing the outcome with the broader IT team can prevent similar problems in the future.
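
For teams that want to document the process in a consistent way, the steps above could be captured in a small record like the sketch below. The class and field names (FiveWhysRecord, WhyStep, participants, corrective_action) are hypothetical and only illustrate one possible structure; a ticketing or knowledge base tool would work just as well.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WhyStep:
    question: str
    answer: str


@dataclass
class FiveWhysRecord:
    """One 5 Whys session: the problem, who took part, and the chain of answers."""
    problem: str
    participants: List[str] = field(default_factory=list)
    steps: List[WhyStep] = field(default_factory=list)
    corrective_action: str = ""

    def ask_why(self, question: str, answer: str) -> None:
        """Record one 'Why?' and the answer the team agreed on."""
        self.steps.append(WhyStep(question, answer))

    def root_cause(self) -> Optional[str]:
        """The last answer in the chain is treated as the root cause."""
        return self.steps[-1].answer if self.steps else None

    def summary(self) -> str:
        """Plain-text write-up that can be shared with the wider team."""
        lines = [f"Problem: {self.problem}"]
        for number, step in enumerate(self.steps, start=1):
            lines.append(f"{number}. {step.question}")
            lines.append(f"   - {step.answer}")
        lines.append(f"Root cause: {self.root_cause() or 'not yet identified'}")
        if self.corrective_action:
            lines.append(f"Corrective action: {self.corrective_action}")
        return "\n".join(lines)


record = FiveWhysRecord(
    problem="Our website keeps crashing during peak traffic hours.",
    participants=["IT", "development", "support"],
)
record.ask_why(
    "Why is the website crashing?",
    "Because the server is overwhelmed by too many simultaneous requests.",
)
record.ask_why(
    "Why is the server overwhelmed by too many requests?",
    "Because the server capacity is not sufficient to handle peak traffic.",
)
print(record.summary())
```

The record does not enforce exactly five questions, since the number of whys can vary with the complexity of the issue.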

 

Example: Applying the 5 Whys to an IT Incident

Let’s look at an example of the 5 Whys applied to an IT incident:

Problem: Users are unable to access the company’s cloud-based application.

  1. Why are users unable to access the application?
    • Because the application server is down.
  2. Why is the application server down?
    • Because it ran out of storage space.
  3. Why did it run out of storage space?
    • Because log files weren’t automatically deleted.
  4. Why weren’t the log files automatically deleted?
    • Because the log retention policy wasn’t correctly configured.
  5. Why wasn’t the log retention policy configured correctly?
    • Because there was no review of the retention settings when the server was set up.

Root Cause: The server setup process didn’t include a review of critical configurations like log retention. The solution would be to implement a review step in the server setup process to prevent similar issues in the future.
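
As a rough illustration of what such a review step might look like when automated, the sketch below checks a server's configuration for required settings before it goes live. The setting name log_retention_days, the accepted range of 1 to 365 days, and the review_setup function are assumptions made for this example, not part of any specific tool.

```python
# Hypothetical checklist: setting name -> (minimum, maximum) accepted value.
REQUIRED_SETTINGS = {
    "log_retention_days": (1, 365),
}


def review_setup(config):
    """Return a list of findings; an empty list means the review passed."""
    findings = []
    for key, (low, high) in REQUIRED_SETTINGS.items():
        value = config.get(key)
        if value is None:
            # The setting was never configured, as in the incident above.
            findings.append(f"{key}: not set")
        elif not isinstance(value, int) or not low <= value <= high:
            findings.append(f"{key}: {value!r} is outside the expected range {low}-{high}")
    return findings


# A new server whose retention setting was never configured.
for finding in review_setup({"hostname": "app-01"}):
    print(finding)
```

Further critical settings could be added to the checklist as they are identified in later problem reviews.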

 

Conclusion

The 5 Whys is an invaluable tool in the IT Problem Management process for uncovering the root cause of incidents quickly and efficiently. By asking "Why?" repeatedly, IT teams can dig deeper into the underlying issues, ensuring that the real problem is addressed, not just its symptoms. When implemented correctly, the 5 Whys technique helps reduce recurring issues, improve system stability, and enhance overall service delivery.

Incorporating the 5 Whys into your IT Problem Management workflow can lead to more sustainable solutions and long-term improvements, benefiting both IT teams and the broader organization.

 

------------------------------------------------------------------------------------------------

The Problem Management Co. (PMCO) develops and delivers the world’s leading Best Practice Training and Certification program in IT Problem Management.

Learn more: www.problemmanagementcompany.com
