Troubleshooting Methodology

This post is a list-style summary of an online class I took in 2022. I’m writing it as a note/cheatsheet for myself and for anyone who may find it useful.

  1. IDENTIFY the problem and determine the scope.
    • Question all stakeholders concerned.
    • Identify the scale of the problem.
    • Collect additional logs / reports.
    • If possible, try to replicate the problem.
    • Perform backups.
    • Escalate if necessary. Asking for help is not a weakness.
  2. ESTABLISH A THEORY of probable cause.
    • Question the obvious.
    • Does the problem stem from a central point, or is it isolated?
    • Escalate if needed.
  3. TEST THE THEORY to determine the cause.
    • Establish a new theory if not confirmed.
    • Undo previous steps.
    • Escalate if needed.
  4. ESTABLISH A PLAN of action to resolve the problem.
    • Notify users/stakeholders who will be impacted.
  5. IMPLEMENT THE SOLUTION or escalate.
    • Make one change at a time, test and confirm.
    • Reverse the change if not resolved.
  6. VERIFY FULL SYSTEM functionality.
    • Implement preventive measures.
  7. PERFORM ROOT CAUSE ANALYSIS.
    • Did you treat the symptom and not the root cause?
  8. DOCUMENT your findings, actions and outcomes.
    • Create a wiki, KBA or FAQ for the admin team.
    • Use the Notes/Remarks inputs.
    • Share the knowledge!
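
Step 5 ("make one change at a time, test and confirm; reverse the change if not resolved") can be sketched as a small loop. This is only an illustration, not something from the class: the `Change` type, the `try_changes` helper, and the toy `state` dictionary are all hypothetical names made up for the example, and a real fix would replace the lambdas with actual apply/revert/check actions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Change:
    """One candidate fix (hypothetical type, for illustration only)."""
    name: str
    apply: Callable[[], None]
    revert: Callable[[], None]


def try_changes(changes: List[Change],
                is_resolved: Callable[[], bool],
                log: List[str]) -> Optional[str]:
    """Apply changes one at a time; keep the first that resolves the problem."""
    for change in changes:
        change.apply()
        if is_resolved():
            log.append(f"{change.name}: resolved")
            return change.name
        # Not resolved: reverse the change before trying the next one.
        change.revert()
        log.append(f"{change.name}: not resolved, reverted")
    return None


# Toy system state standing in for a real environment (assumed values).
state = {"dns": "bad", "mtu": 9000}
changes = [
    Change("lower MTU",
           lambda: state.update(mtu=1500),
           lambda: state.update(mtu=9000)),
    Change("fix DNS",
           lambda: state.update(dns="good"),
           lambda: state.update(dns="bad")),
]
log: List[str] = []
fixed_by = try_changes(changes, lambda: state["dns"] == "good", log)
print(fixed_by)
print(log)
```

Because each unsuccessful change is reverted before the next one is tried, the system ends up with only the change that actually fixed the problem, and the `log` list doubles as raw material for step 8 (documentation).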
