Problem and Major Incident Detection

An AI agent that watches the incoming stream of incidents, groups semantically similar tickets into living clusters, and alerts human agents in real time when a pattern emerges, so that a Major Incident or recurring Problem can be declared before it spreads.
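
As a rough illustration of what grouping semantically similar tickets into living clusters can mean, the sketch below assigns each new ticket to the closest existing cluster or opens a new one. It is not the product's algorithm. The embedding vectors, the `assign_to_cluster` helper, the cluster dictionary layout, and the 0.8 similarity threshold are all assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def assign_to_cluster(clusters, ticket_id, vector, threshold=0.8):
    """Attach a ticket to the most similar cluster, or open a new one.

    `clusters` is a list of {"centroid": [...], "tickets": [...]} dicts
    (a hypothetical layout). Returns the index of the chosen cluster.
    """
    best_index, best_sim = None, threshold
    for index, cluster in enumerate(clusters):
        sim = cosine(cluster["centroid"], vector)
        if sim >= best_sim:
            best_index, best_sim = index, sim

    if best_index is None:
        # Nothing is similar enough: this ticket seeds a new cluster.
        clusters.append({"centroid": list(vector), "tickets": [ticket_id]})
        return len(clusters) - 1

    # Update the running-mean centroid so the cluster keeps evolving.
    cluster = clusters[best_index]
    n = len(cluster["tickets"])
    cluster["centroid"] = [
        (c * n + v) / (n + 1) for c, v in zip(cluster["centroid"], vector)
    ]
    cluster["tickets"].append(ticket_id)
    return best_index
```

In this sketch a cluster is "living" only in the sense that its centroid shifts as tickets arrive; the similarity threshold stands in for the kind of tuning a tenant might adjust.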

The same engine produces two outputs depending on cadence and cluster shape. Fast-growing clusters surface as Major Incident candidates that need attention in minutes. Slow-burning recurring patterns surface as Problem candidates that need attention over days. A human stays in the loop on both: every cluster offers a Dismiss / Report as Problem / Report as Major Incident choice, and the AI drafts the resulting record for the human to review and publish.
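
The split between the two outputs can be pictured with a small heuristic. The sketch below is an assumption-laden illustration, not the shipped logic: the `classify_cluster` function, its parameter names, and every threshold value are hypothetical, and ticket volume inside a short window is used as a simple proxy for "fast-growing".

```python
from datetime import datetime, timedelta

def classify_cluster(
    ticket_times,
    now,
    mi_window=timedelta(hours=3),       # hypothetical burst window
    mi_min_tickets=5,                   # hypothetical burst size
    problem_window=timedelta(days=14),  # hypothetical recurrence window
    problem_min_days=3,                 # hypothetical distinct days required
):
    """Label a cluster from the creation times of its tickets."""
    # Many tickets inside a short window: treat as a Major Incident candidate.
    recent = [t for t in ticket_times if now - t <= mi_window]
    if len(recent) >= mi_min_tickets:
        return "major_incident_candidate"

    # Tickets recurring on several distinct days: treat as a Problem candidate.
    in_window = [t for t in ticket_times if now - t <= problem_window]
    distinct_days = {t.date() for t in in_window}
    if len(distinct_days) >= problem_min_days:
        return "problem_candidate"

    # Otherwise keep watching the cluster without surfacing it.
    return "watch"
```

Whatever label comes out, the cluster would still go to a human for the Dismiss / Report as Problem / Report as Major Incident decision; the heuristic only decides how urgently it is surfaced.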

Where it shows up

| Surface | Who sees it | What appears |
| --- | --- | --- |
| Inline banner on a ticket | Service Desk Agent | "This looks related to N other tickets in the last 3 hours. View Cluster / Dismiss" |
| Broadcast popup | Service Desk Agent | When a Major Incident is declared that touches one of the agent's tickets, a popup lists the impacted tickets and links into the MI record |
| Problem / Major Incident / Change Management Dashboard | Problem Manager | Active Clusters, Problem Candidates, Recent Actions, with filters for time range, severity, service, status |
| Draft record drawer | Service Desk Agent / Problem Manager / MI Manager | An auto-generated record with linked incidents, suggested priority, and a pre-filled RCA template; the human reviews and publishes |
| KEDB matches | Service Desk Agent | When a Problem or MI is reported, related Known Errors surface as suggested workarounds or solutions |
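
To make the broadcast popup row concrete, the sketch below works out which agents would receive a popup and which of their tickets it would list. It assumes a simple mapping from ticket ID to assigned agent; the `popup_recipients` function, the ticket IDs, and the agent names are hypothetical and do not reflect the product's data model.

```python
from collections import defaultdict

def popup_recipients(mi_ticket_ids, ticket_assignees):
    """Group the tickets linked to a Major Incident by assigned agent.

    `mi_ticket_ids` is the list of ticket IDs linked to the MI record.
    `ticket_assignees` maps ticket ID -> agent name (hypothetical shape).
    Returns {agent: [impacted ticket IDs]}; unassigned tickets are skipped.
    """
    impacted = defaultdict(list)
    for ticket_id in mi_ticket_ids:
        agent = ticket_assignees.get(ticket_id)
        if agent is not None:
            impacted[agent].append(ticket_id)
    return dict(impacted)

# Example: two agents each own at least one ticket touched by the MI.
assignees = {"INC-101": "dana", "INC-102": "lee", "INC-103": "dana"}
recipients = popup_recipients(["INC-101", "INC-103", "INC-102", "INC-999"], assignees)
```

Each key in `recipients` corresponds to one popup, and the list under it is the set of impacted tickets that popup would show alongside a link to the MI record.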

Personas served

  • Service Desk Agent sees the inline banner on individual tickets, the broadcast popup when a Major Incident is declared, and KEDB suggestions when a cluster is reported.
  • Problem Manager uses the Problem / Major Incident / Change Management Dashboard and a weekly digest to spot recurring patterns and turn the strongest ones into Problem records.
  • Major Incident Manager uses the pre-populated MI record to coordinate the response, with runbook hooks for war-room creation and stakeholder communication.

How it relates to the rest of Agent Assist

The other Agent Assist panels operate on a single ticket at a time. They read the ticket the agent is looking at and surface help inside that ticket's sidebar.

This feature operates on the stream of tickets. It does not show up as a panel inside one ticket. It shows up in three places where stream-level signals matter: the inline banner on individual tickets that are part of an active cluster, the popup that surfaces when a Major Incident is declared, and a dedicated dashboard for Problem Managers.

The two work together. A ticket flagged by High Risk Tickets might also be the first ticket in a forming cluster. The agent sees both signals on the same screen, and either one can prompt an investigation.

What ships today

| Capability | Page |
| --- | --- |
| Semantic clustering and cluster lifecycle | Incident clustering |
| Inline banner, broadcast popups, AI-drafted records, RCA template | Alerts and reporting |
| Problem / Major Incident / Change Management Dashboard for Problem Managers | Problem / Major Incident / Change Management Dashboard |
| Known Error Database with Workaround / Solution tagging | Known Error Database |
| Tenant-level thresholds, RCA template, similarity tuning | Configuration |

Not yet shipped

The current release does not include:

  • Change-ticket correlation with cluster timing ("Timeline matches CHG-0042")
  • Monitoring-log and CMDB health-data integration
  • Visual relationship map across Incident, Cluster, Problem, Change, and Known Error

These are planned for a later milestone.