2.148.2 Incident Management Process

Manual Transmittal

May 11, 2020

Purpose

(1) This transmits revised IRM 2.148.2, IT Support Services Management, Incident Management Process.

Material Changes

(1) Revised entire document with updated IT IRM policy and process templates incorporating required internal controls.

Effect on Other Documents

IRM 2.148.2 dated March 30, 2016 is superseded.

Audience

This process description applies to all organizations within the IRS that request IT support through the Information Technology Service Desk (ITSD), and to anyone in Internal Revenue Service Information Technology (IRS IT) who follows IRM 2.148 and is responsible for providing service and support to Information Technology customers.

Effective Date

(05-11-2020)


Nancy A. Sieger
Acting Chief Information Officer

Program Scope and Objectives

  1. Overview - IRM 2.148.2 describes the formal Incident Management Process.

  2. Purpose - This IRM describes the formal process for implementing the requirements of the Incident Management Process.

  3. Audience - The primary users of IRM 2.148.2 are all organizations within the IRS requesting IT support through the Information Technology Service Desk (ITSD).

  4. Policy Owner - The policy owner is the Associate Chief Information Officer (ACIO), User and Network Services (UNS), who is responsible for oversight of Incident Management.

  5. Program Owner - UNS is the program owner for Incident Management.

  6. Primary Stakeholders - The primary stakeholders are UNS, Enterprise Operations (EOps), ACIO Strategy & Planning (S&P), Application Development (AD), Cybersecurity, and Enterprise Services (ES).

  7. Program Goals - The objective of IRM 2.148.2 is to provide an operational definition of the major components of the process and how to perform each step in the process. It also describes the logical arrangements of steps that are essential to successfully completing the process and achieving its desirable outcome.

Background

  1. This Incident Management Process description describes what happens within the Incident Management Process and provides an operational definition of the major components of the process. This description specifies in a complete, precise, and verifiable manner, the requirements, design, and behavior characteristics of the Incident Management Process. The Process Description (PD) is a documented expression of a set of activities performed to achieve a given purpose. Tailoring of this process in order to meet the individual needs of each project is covered in the Tailoring Guidelines section of this document. This document describes the formal process for implementing the requirements of the Incident Management Process. It provides an operational definition of the major components of the process and how to perform each step in the process. This document also describes the logical arrangements of steps that are essential to successfully completing the process and achieving its desirable outcome.

  2. For this document, roles such as User, Customer, Incident Manager, First Level Support, Second Level Support, and Service Support Provider are provided to describe a set of responsibilities for performing a set of related activities.

Process Description
  1. Incident Management is the process responsible for tracking and recording incidents throughout the incident's lifecycle. This document describes the major steps within the Incident Management Process, which include:

    • Activity 1 - Incident Recording

    • Activity 2 - Categorize and Prioritize

    • Activity 3 - Investigate and Diagnose

    • Activity 4 - Resolve or Dispatch

    • Activity 5 - Closure
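As a rough illustration only, the five activities listed above can be modeled as an ordered sequence. This sketch is not part of KISAM or any IRS system. The names `Activity` and `next_activity` are hypothetical, and the sketch assumes the simple case in which an incident moves straight through the steps without being reopened or rerouted.

```python
from enum import Enum
from typing import Optional


class Activity(Enum):
    """The five Incident Management activities, in process order."""
    INCIDENT_RECORDING = 1
    CATEGORIZE_AND_PRIORITIZE = 2
    INVESTIGATE_AND_DIAGNOSE = 3
    RESOLVE_OR_DISPATCH = 4
    CLOSURE = 5


def next_activity(current: Activity) -> Optional[Activity]:
    """Return the activity that follows `current`, or None after Closure."""
    if current is Activity.CLOSURE:
        return None
    # Activities are numbered consecutively, so the next one is value + 1.
    return Activity(current.value + 1)
```

For example, `next_activity(Activity.INVESTIGATE_AND_DIAGNOSE)` returns `Activity.RESOLVE_OR_DISPATCH`.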

Goal
  1. Specific Process Goals:

    • Incidents are properly logged

    • Incidents are properly routed

    • Incident status is accurately reported

    • Queue of unresolved incidents is visible and reported

    • Incidents are properly prioritized and handled in the appropriate sequence

    • Resolution provided meets the requirements of the Service Level Agreement (SLA) for the Customer

Objectives
  1. This document describes a set of interrelated activities, which transform inputs into outputs, to achieve a given purpose and states the guidelines that all projects should follow regarding the Incident Management Process. The document also includes process goals and objectives, metrics, role definitions, policies and other process-related attributes. The format and definitions used to describe each of the process steps of the Incident Management Process are described below:

    • Purpose – The objective of the process step.

    • Roles and Responsibilities – The responsibilities of the individuals or groups for accomplishing a process step.

    • Entry Criteria – The elements and conditions (state) necessary to trigger the beginning of a process step.

    • Input – Data or material needed to perform the process step. Input can be modified to become an output.

    • Process Activity – The list of activities that make up the process step.

    • Output – Data or material that are created (artifacts) as part of, produced by, or resulting from performing the process step.

    • Exit Criteria – The elements or conditions (state) necessary to trigger the completion of a process step.

Authority

  1. All proposed changes to this document should be directed to the IRS IT User and Network Services (UNS) Customer Service Support Director, as owner of this process description, and be pursued via the Integrated Process Management (IPM) process to clearly define interfaces, roles, and responsibilities, and to coordinate participation and collaboration between stakeholders.

Roles and Responsibilities

  1. Each process defines at least one role. Each role is assigned to perform specific tasks within the process. The responsibilities of a role are confined to the specific process. They do not imply any functional standing within the hierarchy of an organization. For example, the process manager role does not imply the role is associated with or fulfilled by someone with functional management responsibilities within the organization. Within a specific process, there can be more than one individual associated with a specific role. Additionally, a single individual can assume more than one role within the process although typically not at the same time. The following roles have been identified for this process:

    Name - Description
    Process Owner - Accountable for ensuring that a Process is Fit for Purpose. The Process Owner’s responsibilities include sponsorship, Design, Change Management and continual improvement of the process and its metrics.
    Process Manager - Responsible for Operational management of a Process. The Process Manager’s responsibilities include planning and coordination of all activities required to carry out, monitor and report on the Process.
    Process Reviewer - Performs review of service provided, measuring customer satisfaction, field coding, accuracy of assignment groups, etc.
    Incident Manager - A manager who is responsible for managing the end-to-end Lifecycle of one or more IT Services. The term Incident Manager is also used to mean any manager within the IT Service Provider. Most commonly used to refer to a Business Relationship Manager, a Process Manager, an Account Manager or a Senior Manager with responsibility for IT Services overall.
    First Level Support / IT Enterprise Service Desk Specialist - Responsible for performing the recording procedures, the categorize and prioritize procedures, the investigate and diagnose procedures, the resolve or dispatch procedures, and the closure procedures relating to incident records received, using the approved instructions in the Incident Management Knowledgebase (Knowledge Library; Known Error Library). Reviews and monitors progress of incidents. Provides status updates to users and customers.
    Second Level Support - Responsible for routine incidents as well as providing a higher level of technical expertise for specific incidents when the incidents cannot be resolved by the First Level Support IT Specialist.
    Service Support Provider - Any organization that delivers a standard service or product to a Customer or User.
    Service Level Reviewer - Responsible for analyzing and reviewing Service attainment against the Service Level Agreements (SLAs), Organizational Level Agreements (OLAs) and Underpinning Contracts (UCs).
    Customer - The Customer of an IT Service Provider is the person or group who defines and agrees on the Service Level Targets.
    User - A person who uses the IT Services on a day-to-day basis. Users are distinct from Customers, as some Customers do not use the IT Service directly. Provides requested information during the life cycle of the Incident. Responds in a timely manner and takes requested actions as determined by those working on the resolution.

Program Management and Review

  1. The purpose of this procedure is to outline and provide steps to accomplish the Incident Management Process. The procedure will restore a normal service operation as quickly as possible and minimize the impact on business operations thus ensuring the best possible levels of service quality and availability are maintained. The activities involved ensure the process is predictable, stable, and consistently operating at the target level of performance.

  2. Process Management
    Statement:
    This document defines the step-by-step instructions on how to conduct the activities used to implement the Incident Management Procedure in IT UNS Customer Service Support. The purpose of a procedure document is to institutionalize and formalize the preferred method of performing tasks that staff is using. The objective is to have everyone using the same tools and techniques and follow the same repeatable steps so that the organization can quantify how well the procedure is working and train future staff members who may not currently know the routine. Ensuring consistency is a critical component for ensuring optimum efficiency.

  3. People
    Statement:
    Roles and responsibilities for the process must be clearly defined and appropriately staffed with people having the required skills and training. The mission, goals, scope and importance of the process must be clearly and regularly communicated by upper management to the staff and business customers of IT. All IT staff (direct and indirect users of the process) shall be trained at the appropriate level to enable them to support the process.
    Rationale:
    It is imperative that people working in, supporting, or interacting with the process in any manner understand what they are supposed to do. Without that understanding the Incident Management Process will not be successful.

  4. Process
    Statement:
    All proposed changes to this document should be directed to the IRS IT User and Network Services (UNS) Customer Service Support owner of this procedure and pursued via the IPM process to clearly define interfaces, roles, and responsibilities, and to coordinate participation and collaboration between stakeholders.
    Rationale:
    The process must meet operational and business requirements.

  5. Technology and Tools
    Statement:
    All tools selected must conform to the enterprise architectural standards and direction. Existing in-house tools and technology will be used wherever possible; new tools will only be entertained if they satisfy a business need that cannot be met by current in-house tools. The selection of supporting tools must be process driven and based on the requirements of the business. Selected tools must provide ease of deployment, customization, and use. The selected tools must support heterogeneous platforms. Automated workflow, notification and escalation will be deployed wherever possible to minimize delays, ensure consistency, reduce manual intervention and ensure appropriate parties are made aware of issues requiring their attention. The tools used by this process are the following:

    • KISAM

    • Symantec Management Platform

    • IRS Desktop Softphone

    • Bomgar Representative Console

    • Remote Desktop Connection

    • Remote Tools Console (RTC)

    • Symantec Endpoint Encryption One Time Password

    • QWERT


    Rationale:
    Technology and tools should be used to augment the process capabilities, not become an end in themselves.

Program Control

  1. Program controls are the activities involved in ensuring a process is predictable, stable, and consistently operating at the target level of performance.

Controls
  1. Process controls represent the policies and guiding principles on how the process will operate. Controls provide direction over the operation of processes and define constraints or boundaries within which the process must operate.

Metrics
  1. Metrics are used for the quantitative and periodic assessment of a process. They should be associated with targets that are set based on specific business objectives. Metrics provide information related to the goals and objectives of a process, are used to take corrective action when desired results are not being achieved, and can be used to drive continual improvement of process effectiveness and efficiency.

  2. Management will regularly review quantifiable data related to different aspects of the Incident Management process to make informed decisions and take appropriate corrective action, if necessary.

  3. Examples of Key measurements are:

    • Service Level Agreement breaches

    • Average time to resolve by incident type

    • Accuracy of classification
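The three example measurements above could be computed along the following lines. This is a minimal sketch under stated assumptions, not how any IRS reporting tool works. The `ClosedIncident` record, its field names, and the idea of a per-incident SLA target expressed in hours are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ClosedIncident:
    """Hypothetical summary of a closed incident, for illustration only."""
    incident_type: str
    hours_to_resolve: float
    sla_target_hours: float      # assumed per-incident SLA resolution target
    classified_correctly: bool   # e.g. as judged during a process review


def sla_breaches(incidents):
    """Count incidents whose resolution time exceeded the SLA target."""
    return sum(1 for i in incidents if i.hours_to_resolve > i.sla_target_hours)


def average_hours_by_type(incidents):
    """Average time to resolve, grouped by incident type."""
    hours = defaultdict(list)
    for i in incidents:
        hours[i.incident_type].append(i.hours_to_resolve)
    return {t: sum(h) / len(h) for t, h in hours.items()}


def classification_accuracy(incidents):
    """Fraction of incidents that were classified correctly (0.0 if none)."""
    if not incidents:
        return 0.0
    return sum(i.classified_correctly for i in incidents) / len(incidents)
```

Each function takes a list of `ClosedIncident` records, so the same data set can feed all three measurements in a periodic management review.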

Tailoring Guidelines
  1. The process can be tailored only in extenuating circumstances and with the prior approval of the process owner.

Terms/Definitions/Acronyms

  1. Definitions of Incident Management terms and acronyms.

Terms and Definitions
  1. The definitions listed below are some commonly used terms and are provided as an aid to understanding Incident Management.

    Term - Definition
    Acquisition - The process of obtaining products (goods and services) through contract.
    Artifact - A work product created by a process or procedure step, e.g. plans, design specifications, etc.
    Service Catalog - Entries in the Service Catalog are associated with a specific product or fulfillment service that needs to be delivered. ITIL definition of Service Catalog: A database or structured Document with information about all Live IT Services, including those available for Deployment. The Service Catalog includes information about deliverables, prices, contact points, ordering and request Processes.
    Configuration Item (CI) - A CI can be any piece of equipment or component that is tracked through a device record in Configuration Management.
    Customer - The Customer of an IT Service Provider is the person or group who defines and agrees on the Service Level Targets.
    Escalation - The action taken in the Recording System to change an interaction into an incident. May also refer to Managerial Escalation, when an incident is elevated to Management to assist in resolution; this could involve a change in the impact or urgency if the event warrants a more immediate response. May also refer to Technical Escalation, when an incident is transferred to a higher service level technician to assist in resolution; this could involve a change in the assignment if the event warrants a different skill set.
    Event - A change of state which has significance for the management of a Configuration Item or IT Service.
    Event Management - The process responsible for managing events throughout their life cycle. It allows for normal operation and detects and escalates exception conditions.
    Impact - A measure of the effect of an Incident, Problem or Change on Business Processes. Impact and Urgency are used to assign Priority. See Exhibit A (1).
    Incident - An unplanned interruption to an IT Service or a reduction in the quality of an IT Service.
    Incident Record - A Record containing the details of an Incident. Each Incident record documents the Lifecycle of a single Incident.
    Information Alert (Info Alert) - Email notifications to relay information to internal and external customers. The templates used to distribute this information are General (information only), Scheduled (scheduled outages), Unscheduled Outages (major system outages), CADE2 Elongated Day/Week and End of Day (daily processing).
    Interaction - The documented conversation between the First Level Support IT Service Desk Specialist and the User/Customer to obtain information regarding the issue, or a record automatically created through the System when the User/Customer submits a break/fix or service request via the web.
    Knowledgebase - A logical database containing the data used by the Service Knowledge Management System. A published source of approved, definitive workarounds and resolutions for commonly experienced known errors. Integrates with interaction, incident, and problem management so that Users/Customers can search for and use knowledge articles developed from prior incidents or problems to resolve a new instance of the same type of incident or known error.
    Knowledge Library - The Knowledge Library is a central repository of the data, information and knowledge that the IT organization needs to manage the lifecycle of its services. Its purpose is to store, analyze and present the service provider's data, information and knowledge.
    Known Error - A Problem that has a documented Root Cause and a Workaround. Known Errors are created and managed throughout their lifecycle by Problem Management. Known Errors may also be identified by Development or Suppliers.
    Known Error Library - The Known Error Library contains open and closed known error records which hold exact details of problems with precise details of workarounds or resolution actions that should be used to restore service and/or resolve a problem. These records are created by Problem Management (PM) and used by Incident Management and PM.
    Priority - A Category used to identify the relative importance of an Incident, Problem or Change. Priority is based on Impact and Urgency and is used to identify required times for actions to be taken. See Exhibit A (3).
    Problem - A cause of one or more Incidents. The cause is not usually known at the time a Problem Record is created, and the Problem Management Process is responsible for further investigation.
    Process Review - Review of an Incident to ensure that all internal processes were followed and documented to expedite problem resolution.
    Problem Management - The Process responsible for managing the Lifecycle of all Problems. Problem Management proactively prevents Incidents from happening and minimizes the impact of Incidents that cannot be prevented.
    PM Candidate - Incidents that are flagged as potential problems to be opened/investigated.
    Resolution - Action taken to repair the Root Cause of an Incident or Problem, or to implement a Workaround.
    Risk - A possible Event that could cause harm or loss, or affect the ability to achieve Objectives. A Risk is measured by the probability of a Threat, the Vulnerability of the Asset to that Threat, and the Impact it would have if it occurred.
    Role - A set of responsibilities, Activities and authorities granted to a person or team. A Role is defined in a Process. One person or team may have multiple Roles. For example, the Roles of Configuration Manager and Change Manager may be carried out by a single person.
    Service Level Agreement (SLA) - An Agreement between an IT Service Provider and a Customer. The SLA describes the IT Service, documents Service Level Targets, and specifies the responsibilities of the IT Service Provider and Customer. A single SLA may cover multiple IT Services or multiple Customers.
    Service Request - A request from a User/Customer for information, or advice, or for a Standard Change or for Access to an IT Service. For example, to reset a password, or to provide standard IT Services for a new User/Customer. Service Requests are usually handled by a Service Desk, and do not require an RFC to be submitted.
    Status - The name of a required field in many types of Record. It shows the current stage in the Lifecycle of the associated Configuration Item, Incident, Problem, etc.
    System - The IT System that is used to record, classify, prioritize, document, provide supporting information, route, and track all events, incidents, problems and requests within the IT environment.
    Urgency - A measure of how long it will be until an Incident, Problem or Change has a significant Impact on the Business. The lower the value number, the sooner it must be addressed. Impact and Urgency are used to assign Priority. See Exhibit A (2).
    User - A person who uses the IT Service on a day-to-day basis.
    Workaround - Reducing or eliminating the Impact of an Incident or Problem for which a full resolution is not yet available. For example, by restarting a failed Configuration Item. Workarounds for Problems are documented in Known Error Records. Workarounds for Incidents that do not have associated Problem Records are documented in the Incident Record.

Acronyms
  1. The abbreviations and acronyms include an alphabetical listing of some commonly used terms in Incident Management.

    Acronym - Definition
    ACIO - Associate Chief Information Officer
    AD - Applications Development
    CMDB - Configuration Management Database
    CMMI - Capability Maturity Model Integration
    CSS - Customer Service Support
    CIO - Chief Information Officer
    FIFO - First In First Out
    IPM - Integrated Process Management
    IRM - Internal Revenue Manual
    IT - Information Technology
    ITIL - Information Technology Infrastructure Library
    ITPAL - Information Technology Process Asset Library
    ITSD - Information Technology Service Desk
    ITSM - Information Technology Service Management
    KISAM - Knowledge Incident/Problem Service and Asset Management
    OLA - Organizational Level Agreement
    OTP - One Time Password
    PD - Process Description
    PM - Problem Management
    PMI - Project Management Institute
    PWM - Password Management
    RFC - Request for Change
    SKMS - Service Knowledge Management System
    SLA - Service Level Agreement
    SOP - Standard Operating Procedures
    UNS - User and Network Services
    UC - Underpinning Contract

Related Resources

  1. Related Directives are:

    • CTO, CIO, ACIO approved official Playbooks

    • CTO, CIO, ACIO approved Business rules

    • CTO, CIO, ACIO approved measures

Training

  1. Enterprise Service Desk Training

  2. Remote tools training for Enterprise Service Desk

  3. Incident Ticket Processing Training

  4. Enterprise Service Desk current Incident systems training

  5. Service Support organizations specific training programs

Process Workflow

  1. A workflow consists of Activities and Tasks, Inputs and Outputs, Roles, and Flow Diagrams. It describes the tasks, procedural steps, organizations or people involved, required input and output information, and tools needed for each step of the process.

Main Process Diagram

  1. IT Incident Management Process Diagram

    Figure 2.148.2-1 (image: 68034001.gif)

  2. Note: When an incident is reassigned, update the ticket with a written justification indicating why the ticket needs to be reassigned and what actions the service provider needs to perform.
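The reassignment rule in the note above can be pictured as a simple guard. The following is a minimal sketch, not KISAM behavior. The `Ticket` class, its `journal` field, and the `reassign` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    """Hypothetical ticket with an assignment group and an update journal."""
    ticket_id: str
    assignment_group: str
    journal: list = field(default_factory=list)


def reassign(ticket, new_group, justification, requested_actions):
    """Reassign a ticket only when both required written entries are present."""
    if not justification.strip():
        raise ValueError("A written justification is required to reassign a ticket.")
    if not requested_actions.strip():
        raise ValueError("The actions requested of the service provider are required.")
    # Record why the ticket moved and what the receiving group should do.
    ticket.journal.append(
        f"Reassigned from {ticket.assignment_group} to {new_group}. "
        f"Justification: {justification} Requested actions: {requested_actions}"
    )
    ticket.assignment_group = new_group
    return ticket
```

Rejecting the reassignment outright, rather than accepting it and flagging it later, keeps the justification and the requested actions attached to the ticket at the moment it changes hands.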

Inputs

  1. Process inputs are used as triggers to initiate the process and to produce the desired outputs. Users, Customers, Stakeholders or other processes provide inputs.

ACTIVITY 1: Incident Recording
  1. Incident Recording

Purpose
  1. The purpose of this process step is to record the details of the incident and verify User/Customer information in order to log the incident.

Roles and Responsibilities
  1. The User/Customer is responsible for providing accurate information related to the incident being reported:

    • Details of the service interruption

    • Impact information

    • Urgency of outage

  2. The First-Level Support IT Enterprise Service Desk Specialist is responsible for documenting the incident data

  3. The Second-Level Support is responsible for:

    • Aiding the First-Level Support IT Enterprise Service Desk Specialists, as needed, in determining appropriate impacts and urgencies and providing the first step in the technical escalation process

  4. The Incident Manager at the Service Desk is responsible for:

    • Ensuring employees are following the Incident Management Process

    • Providing the first step in the managerial escalation process within the Service Desk

Entry Criteria
  1. Generally, the Incident Recording Process occurs after one or more of the following contacts have occurred:

    • An Incident has been initiated by phone call, web chat session, web, or e-mail to the Service Desk

    • The incident was determined to be a failure in an IT service or product. (example: break/fix)

Input
  1. The following are inputs to the Incident Recording Process step:

    • Contacts from Service Desk (reference Service Desk Function)

    • IRS IT service User and Customer contact by phone, web chat, web, or email

    • Service Level Agreements

    • Catalog of Services

    • Knowledgebase (Known Error Library, Knowledge Library, etc.)

    • Automated alerts from IT assets

    • Event records meeting the Entry Criteria to the Incident Management process

Process Activity
  1. Receipt and recordation of Incident

    • Receipt of Incident

    • Automatically Route Incident record (yes/no)

    • Route to next available IT Specialist

    • IT Specialist contacts User/Customer if not already on call with them

    • User/Customer information verified

    • Generate incident identification number

    • Incident record updated with any additional information

    • Exits to Categorize and Prioritize
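The recording steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual recording system. The `IncidentRecord` class, the `record_incident` function, and the `INC-` identifier format are hypothetical, and routing to the next available IT Specialist is left out.

```python
import itertools
from dataclasses import dataclass, field

# Hypothetical in-memory counter standing in for the system's ID generator.
_sequence = itertools.count(1)


@dataclass
class IncidentRecord:
    """Hypothetical incident record created during Incident Recording."""
    incident_id: str
    user_name: str
    description: str
    notes: list = field(default_factory=list)


def record_incident(user_name, description, user_verified):
    """Verify the User/Customer, then generate an ID and log the incident."""
    if not user_verified:
        raise ValueError("User/Customer information must be verified before logging.")
    incident_id = f"INC-{next(_sequence):06d}"
    return IncidentRecord(incident_id, user_name, description)
```

Additional information gathered later in the call would be appended to `notes` before the record exits to Categorize and Prioritize.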

Output
  1. The following are outputs to this process step:

    • Incident record with completed User/Customer information

    • User/Customer’s initial definition of incident

    • Incident record linked to related problem record, known errors, etc.

    • Incident record number is provided to the User/Customer

Exit Criteria
  1. This process step is complete when:

    • An Incident record is systemically processed

    • An Incident record is manually processed

ACTIVITY 2: Categorize and Prioritize
  1. Categorize and Prioritize

Purpose
  1. The purpose of this process is to outline the steps required to categorize the service, project and program the incident relates to as well as prioritize the incident based on its impact and urgency.

Roles and Responsibilities
  1. The User/Customer is responsible for:

    • Being available to support the IT Provider in their efforts to resolve the incident

  2. The First-Level Support IT Enterprise Service Desk Specialist is responsible for:

    • Working with the User/Customer

    • Referring to Knowledge Article data on prioritizing incidents for proper coding

    • Documenting categorization and prioritization

  3. The Second-Level Support is responsible for:

    • Helping the First-Level Enterprise Service Desk IT Specialists, as needed

  4. The Incident Manager is responsible for:

    • Ensuring their employees are following the Incident Management Process

    • Providing the first step in the managerial escalation process within the Service Desk

Entry Criteria
  1. Generally, Categorize and Prioritize occurs after the following events have occurred:

    • An Incident record was systemically processed

    • An Incident record was manually processed

Input
  1. The following are inputs to this process step:

    • An Incident record

    • Data in the Knowledgebase

    • Impact & Urgency

    • Service Level Agreements

    • Catalog of Services

    • User/Customer information

Process Activity
  1. Categorize Incident

    • Based on incident description

    • Utilize Knowledge Articles and SLAs to identify the appropriate categories

    • Update the incident record with the appropriate categories

  2. Prioritize Incident
    Use the following:

    • Impact parameters – based on number of customers

    • Urgency parameters – based on Service Level Agreements
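Priority is assigned from Impact and Urgency, and the authoritative values are those in Exhibit A, which is not reproduced in this section. The sketch below therefore uses an assumed rule purely to show the mechanics: both inputs are on a hypothetical scale from 1 (highest) to 4 (lowest), and priority is their average rounded up. The function name `assign_priority` is also hypothetical.

```python
def assign_priority(impact: int, urgency: int) -> int:
    """Combine impact and urgency into a priority (lower value = sooner).

    Assumed rule for illustration only; the real matrix is in Exhibit A.
    """
    for name, value in (("impact", impact), ("urgency", urgency)):
        if value not in range(1, 5):
            raise ValueError(f"{name} must be between 1 and 4, got {value}")
    # Integer form of ceil((impact + urgency) / 2).
    return (impact + urgency + 1) // 2
```

Under this assumed rule, an incident with impact 1 and urgency 2 receives priority 2, and an incident with impact 4 and urgency 4 receives priority 4.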

Output
  1. The following are outputs to this process step:

    • An Incident record that has been categorized and a priority level has been assigned

    • Data related to the incident record

Exit Criteria
  1. This process step is complete when:

    • The incident has been categorized and prioritized

ACTIVITY 3: Investigate and Diagnose
  1. Investigate and Diagnose

Purpose
  1. The purpose of this process step is to collect detailed information on the symptoms of an incident and develop a diagnosis of the issue, so a Knowledge Article can be identified.

Roles and Responsibilities
  1. The First-Level Support Enterprise Service Desk Specialist is responsible for:

    • Working with the User/Customer

    • Researching the Knowledge Database (Known Error Library and Knowledge Library)

    • Identifying a candidate for Knowledge Database if no article found

    • Confirming categorization and prioritization

  2. The Second-Level Support is responsible for:

    • Helping the First-Level Support IT Specialists, as needed

  3. The Incident Manager is responsible for:

    • Ensuring their employees are following the Incident Management Process

    • Providing the first step in the managerial escalation process within the Service Desk

Entry Criteria
  1. Generally, Investigate and Diagnose occurs after the following events have occurred:

    • Receipt of an incident record from Categorize and Prioritize

Input
  1. The following are inputs to this process step:

    • An incident record with a priority assigned related to its classification type

    • Incident record that was created in error (e.g. a misidentified Service Request; a Change Request)

Process Activity
  1. Investigate

    • Collect and confirm detailed information

    • Direct incident record created in error to the correct process

  2. Diagnose

    • Research Knowledgebase (Known Error Library, Knowledge Library, etc.)

    • Identify as a candidate for Knowledge Database, if no article found

    • Move incident record to Resolve or Dispatch step
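The diagnosis steps above amount to a lookup with a fallback. The following is a minimal sketch, assuming a Knowledgebase that can be represented as a mapping from article IDs to keywords. The function name `diagnose`, the keyword-matching approach, and the article IDs in the example are hypothetical and do not describe the actual Knowledgebase search.

```python
def diagnose(symptom_keywords, knowledgebase):
    """Return (article_id, is_knowledge_candidate) for an incident.

    `knowledgebase` maps an article ID to a list of keywords. The first
    article sharing any keyword with the symptoms is returned. If none
    matches, the incident is flagged as a candidate for a new article.
    """
    wanted = {k.lower() for k in symptom_keywords}
    for article_id, keywords in knowledgebase.items():
        if wanted & {k.lower() for k in keywords}:
            return article_id, False
    return None, True
```

For example, with `{"KA-1": ["printer", "offline"], "KA-2": ["vpn"]}` as the Knowledgebase, the symptoms `["VPN", "timeout"]` match `KA-2`, while `["badge"]` matches nothing and is flagged as a candidate.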

Output
  1. The following are outputs to this process step:

    • Incident record

    • Data related to the incident or problem management candidate

    • Incident record created in error is routed to the correct process

Exit Criteria
  1. This process step is complete when:

    • A diagnosed incident record is associated to the appropriate Knowledge Article and moved to the next step

    • A problem task is routed to Problem Management

    • An incident is linked to another existing incident for same issue

    • The Incident record created in error is routed to the correct process

ACTIVITY 4: Resolve or Dispatch Incident
  1. Resolve or Dispatch Incident

Purpose
  1. The purpose of this procedure is to outline the steps to determine if an incident record is resolved at the Service Desk, dispatched to a Service Provider, or to invoke the Change Management Process.

Roles and Responsibilities
  1. The User/Customer is responsible for:

    • Being available and supportive of the inquiry made by the IT Specialist in efforts to resolve the incident

    • Providing additional information to clarify the problem

  2. The First-Level Support Enterprise Service Desk Specialist is responsible for:

    • Working with the User/Customer and other parties as needed to implement the approved workaround or solution documented in the Knowledgebase (Known Error Library or Knowledge Library) for the incident reported

    • Documenting the steps taken to resolve the incident

    • Being proficient in all the tools required to resolve and restore services

  3. The Second-Level Support is responsible for:

    • Helping the IT Specialists and other parties as needed to implement the approved workaround or solution in the Knowledgebase (all libraries) to resolve incident for the User/Customer

    • Being proficient in all the tools required to resolve and restore services

  4. The Incident Manager is responsible for:

    • Ensuring employees are following the Incident Management process steps

    • Providing the first step in the managerial escalation process within the Service Desk

  5. The Service Support Provider is responsible for:

    • Working with the User/Customer and other parties as needed to implement the approved workaround documented in the Known Error Library or the solution documented in the Knowledge Library. These skills may be technical, process or system related. This support is provided by employees in other IT Functions, vendors, and business units

    • Being proficient in all the tools required to resolve and restore services

Entry Criteria
  1. Generally, the Resolve or Dispatch step occurs after the following events have transpired:

    • An incident record has been vetted and diagnosed in the previous step, Investigate and Diagnose, and an appropriate Knowledge Article is associated

    • A solution or workaround for incident records is developed where none previously existed

Input
  1. The following are inputs to this process step:

    • Incidents identified as Critical Incidents

    • A diagnosed incident record that was associated to the appropriate Knowledge Article

    • A diagnosed incident linked to an existing incident for same issue

    • Incident records received from Investigate and Diagnose step

    • Incidents transferred for Problem Management solution development

    • Incidents reopened for rework by management based on feedback, process review or escalation issues

    • Data in the Knowledgebase related to the issue in incident

    • Information on related open issues and incidents

Process Activity
  1. Resolve or Dispatch

    • Incidents identified as “Critical” will follow Standard Operating Procedures (SOP) for Critical Incidents

    • Follow the Knowledge Article instructions to resolve or dispatch to resolve

    • Confirm resolution

    • Route to Incident Closure step

Output
  1. The following are outputs to this process step:

    • Confirmed resolution

    • Data related to the resolved incident

    • Incidents identified as candidate for Knowledge Database

    • Request For Change for non-standard activities

Exit Criteria
  1. This process step is complete when:

    • The incident is resolved

ACTIVITY 5: Incident Closure
  1. Incident Closure

Purpose
  1. The purpose of this process step is the closure of a resolved incident.

Roles and Responsibilities
  1. The User/Customer is responsible for:

    • Being available to support the IT Provider in their efforts to close the incident

  2. The First-Level Support IT Enterprise Service Desk Specialist is responsible for:

    • Working with the User/Customer

    • Documenting knowledge articles followed

    • Confirming restoration of service

    • Documenting restored service

    • Escalating to management (e.g. complaint process; policy clarification; customer request, reworks)

    • Closing the incident

  3. The Second-Level Support is responsible for:

    • Working with the User/Customer

    • Documenting knowledge articles followed

    • Confirming restoration of service

    • Documenting restored service

    • Escalating to management (e.g. complaint process; policy clarification; customer request, reworks)

    • Closing the incident

  4. The Incident Manager is responsible for:

    • Ensuring employees are following the Incident Management process steps

    • Ensuring proper documentation of resolution steps was completed in a timely manner

    • Providing the first step in the managerial escalation process within the Service Desk

  5. The Service Support Provider is responsible for:

    • Working with the User/Customer

    • Documenting knowledge articles followed

    • Confirming restoration of service

    • Documenting restored service

    • Escalating to management (e.g. complaint process; policy clarification; customer request, reworks)

    • Closing the incident

  6. The Service Level Reviewer is responsible for:

    • Confirming SLA, OLA, and UPC requirements are met

    • Ensuring service level improvement plans are implemented if needed

    • Coordinating improvement process with all Service Support Providers

    • Confirming restoration of service

  7. The Process Reviewer is responsible for:

    • Performing survey of customers

    • Evaluating per statistical sampling

      □Field validation

      □Customer concurrence

      □Documentation

    • Providing feedback to management for continuing process improvement

Entry Criteria
  1. Generally, Incident Closure occurs after the following:

    • A service is restored from the previous step, Resolve or Dispatch

    • Process review is completed

    • Requestor canceled the incident (automated closure by the system)

Input
  1. The following are inputs to this process step:

    • Resolved Incidents

    • Cancelled incidents

Process Activity
  1. Review incident details

    • IT Specialist or Service Support Provider reviews incident details

  2. Obtain User/Customer concurrence

    • IT Specialist or Service Support Provider requests customer concurrence to resolve:

      □If Yes – incident proceeds to Process Review, if included in sampling

      □If Yes – but not included in sampling, proceed to User Satisfaction Survey

      □If No – Management escalation process

  3. Process Review

    • Evaluates work product based on sampling size or priority of incident

  4. User Satisfaction Survey

    • Evaluates Customer Satisfaction with the service received

  5. Formal closure of the incident record
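
The concurrence and sampling branches in step 2 can be read as a small routing decision. The following is a minimal sketch only; the function name `next_closure_step`, its parameters, and the returned labels are illustrative assumptions and do not describe any IRS ticketing tool.

```python
def next_closure_step(customer_concurs: bool, in_sample: bool) -> str:
    """Return the activity that follows the request for customer concurrence.

    customer_concurs: the User/Customer agreed that the incident is resolved.
    in_sample: the incident was selected for Process Review (hypothetical flag).
    """
    if not customer_concurs:
        # "If No": the incident goes to the management escalation process.
        return "Management escalation"
    if in_sample:
        # "If Yes" and included in sampling: Process Review comes next.
        return "Process Review"
    # "If Yes" but not included in sampling: go to the survey.
    return "User Satisfaction Survey"
```

In this sketch the sampling check is only consulted after concurrence is obtained, which mirrors the order of the bullets in step 2.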

Output
  1. The following are outputs to this process step:

    • Closed incidents for IT services

    • Cancelled incidents for IT services

    • Data related to closed incidents

    • Data related to cancelled incidents

    • Management Escalation if no customer concurrence

    • Process Review results

    • Customer Satisfaction Survey results

Exit Criteria
  1. This process step is complete when:

    • Incident records are closed or cancelled

Process Measurement
  1. Management will regularly review quantifiable data related to different aspects of the Incident Management process to make informed decisions and take appropriate corrective action, if necessary.

  2. Examples of Key measurements are:

    a. Service Level Agreement breaches

    b. Average time to resolve by incident type

    c. Accuracy of classification

    d. Quality of work products

    e. Cost of service
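
Two of the key measurements listed above, Service Level Agreement breaches and average time to resolve by incident type, could be computed from exported incident data roughly as follows. This is a sketch under stated assumptions: the record fields (`type`, `hours_to_resolve`, `target_hours`) and the sample data are hypothetical and are not drawn from any IRS system.

```python
from collections import defaultdict
from statistics import mean


def average_hours_to_resolve(incidents):
    """Average time to resolve, grouped by incident type."""
    hours_by_type = defaultdict(list)
    for incident in incidents:
        hours_by_type[incident["type"]].append(incident["hours_to_resolve"])
    return {kind: mean(hours) for kind, hours in hours_by_type.items()}


def count_sla_breaches(incidents):
    """Count incidents whose resolution time exceeded the agreed target."""
    return sum(
        1 for incident in incidents
        if incident["hours_to_resolve"] > incident["target_hours"]
    )


# Hypothetical sample data for illustration only.
sample = [
    {"type": "password", "hours_to_resolve": 2.0, "target_hours": 16},
    {"type": "password", "hours_to_resolve": 4.0, "target_hours": 16},
    {"type": "network", "hours_to_resolve": 10.0, "target_hours": 8},
]
```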

Outputs

  1. The primary outputs of the Incident Management Procedure are:

    • Closed incidents for IT services

    • Cancelled incidents for IT services

    • Data related to closed incidents

    • Data related to cancelled incidents

    • Management Escalation if no customer concurrence

Exit Criteria
  1. This Incident Management Procedure is exited when:

    • Incidents are closed or cancelled

Activities

  1. This Incident Management Procedure covers the following activities:

    • Incident Recording

    • Categorize and Prioritize

    • Investigate and Diagnose

    • Resolve or Dispatch Incident

    • Incident Closure

Procedure Flow Diagram
  1. The procedure flow diagram is illustrated under 2.148.2.2.4.2, Activity and Steps

Activity and Steps
  1. This section delineates the activity steps, including roles and tools or templates, needed to perform each step of this Procedure.

    Incident Recording  
    Steps Roles
    1. Receipt of Incident
    • Event Records identified as an incident will enter the Incident Management process. These records can be sourced from the web, e-mail, phone calls, web chats, automated alerts, etc.

    System
    First Level Support
    Second Level Support
    2. Is Incident automatically routed?
    If yes, System will route incident to defined queue based on how it was received through web.
    1. System will determine urgency and impact based on input data and SLAs as predetermined for incident record.

    2. System will complete investigate and diagnose procedure based on SLA, OLA, UC and Knowledge articles.

    3. System will complete resolve or dispatch procedure by routing incident record to assignment group based on SLA, OLA, UC and Knowledge articles.


    If no, Go to the next Step
    System
    3. Route to next available IT Specialist
    • System will combine all incident records from various sources received (examples: email, web, voice, chat).

    • System will place incident records into a queue to be worked. Goal is to allow for (First in First out) FIFO processing.

    • Go to next Step

    System
    4. IT Specialist contacts User/Customer.
    IT Specialist determines if incident record requires interaction with the User/Customer via email, voice, or instant messaging
    • If already interacting with User/Customer, Go to next Step

    • If not interacting with User/Customer, IT Specialist checks Knowledge Article to determine if contact with User/Customer is required.
      • If no, go to Step 6, Generate incident identification number.
      • If yes, IT Specialist contacts User/Customer.

    • If the User/Customer cannot be contacted, document the attempt and place the incident in the Enterprise Service Desk queue for follow-up. Activity completed; go to Step 3.

    • If User/Customer contact is made go to next Step.

    First Level Support
    5. Verify User/Customer Information
    • IT Specialist verifies User/Customer information

    • IT Specialist confirms all answers match data in tool

    • IT Specialist enters any changes of information into tool

    • IT Specialist goes to next Step

    First Level Support
    6. Generate incident identification number
    • System will provide a tracking record number

    • Go to next Step

    System
    7. Update Incident Record with any additional information
    • IT Specialist inquires with user for any additional descriptive or detailed information that may help with ticket type determination.

    • IT Specialist documents any additional descriptive or detailed information.

    • IT Specialist does a quality review check to ensure all required fields are correct and required information is documented in the ticket.

    • Exit to Categorize and Prioritize.

    First Level Support
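
Step 3 of the Incident Recording procedure above combines records from every intake source into one queue with a First in First out (FIFO) goal. A minimal sketch of that idea follows; the function name `build_work_queue`, the tuple layout, and the incident identifiers are assumptions made for illustration, not a description of the actual system.

```python
from collections import deque
from datetime import datetime


def build_work_queue(*sources):
    """Merge incident records from every source into one FIFO queue.

    Each record is a (received_at, incident_id) tuple. Records are ordered
    by receipt time so the oldest record sits at the left of the deque.
    """
    combined = [record for source in sources for record in source]
    combined.sort(key=lambda record: record[0])
    return deque(combined)


# Hypothetical records from two intake channels.
email = [(datetime(2020, 5, 11, 9, 30), "INC-2")]
phone = [
    (datetime(2020, 5, 11, 9, 0), "INC-1"),
    (datetime(2020, 5, 11, 10, 0), "INC-3"),
]
queue = build_work_queue(email, phone)

# The next available IT Specialist takes the oldest record first.
next_record = queue.popleft()
```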

    Figure 2.148.2-2

    Categorize and Prioritize  
    Steps Roles
    1. Receive Incidents from Incident Recording Procedure.
    • Go to next Step

    System
    2. Categorize Incident
    • IT Specialist reviews information in incident record

    • IT Specialist determines the multi-level categorization using the ticketing system’s categorization fields.

    • Go to next Step

    First Level Support
    3. Assign Impact
    • IT Specialist determines number of Users, Customers and Systems affected by issue

    • IT Specialist looks up the correct impact (See Exhibit A)

    • IT Specialist inputs the impact into the incident record

    • Go to next Step

    First Level Support
    4. Assign Urgency
    • IT Specialist researches service affected

    • IT Specialist assigns urgency based on service impacted (See Exhibit A)

    • Go to next Step

    First Level Support
    □ System automatically calculates priority based on Impact and Urgency (See Exhibit A) System
    5. Exit to Investigate and Diagnose Procedure System
    First Level Support

    Figure 2.148.2-3

    Investigate and Diagnose  
    Steps Roles
    1. Receive incident from Categorize and Prioritize Procedure
    • Go to next Step

    System
    First Level Support
    2. Investigate incident
    IT Specialist determines:
    • Incident symptoms

    • Service or services affected

    • Recent changes related to incident starting


    IT Specialist asks Users/Customers questions about the incident symptoms
    • Go to the next Step

    First Level Support
    3. Collect and confirm detailed information
    • IT Specialist documents questions and answers from Step 2.

    • IT Specialist documents Issues, Problem, Facts, Assumptions, and Conclusions

    • Go to next Step

    First Level Support
    4. Confirm incident correctly identified
    • If INCIDENT, IT Specialist checks to see if a related issue is being worked for same incident-type

    • If RELATED ISSUE OPEN, associate to existing incident

    • Incident will be resolved with related incidents

    • If NO RELATED INCIDENT OPEN, proceed to Step 5

    • If NOT AN INCIDENT, IT Specialist stops the Incident Management Process. Exit for closing of incident record and generation of Request (if appropriate).

    First Level Support
    5. Diagnose Incident
    • IT Specialist reviews incident information from Step 3, Collect and confirm detailed information.

    • IT Specialist identifies search keywords to query the Knowledgebase.

    • Go to next Step

    First Level Support
    6. Research Knowledgebase for Known Errors and Knowledge article
    • IT Specialist performs a Knowledgebase search

    • IT Specialist determines if relevant Known Error exists
      □ If YES – follow instructions in Known Error record
      □ If NO – IT Specialist searches for Knowledge Library solution

      □ If YES, follow Knowledge article solution and proceed to Step 6a
      □ If NO solution exists, proceed to Step 7

    First Level Support
    6a. If Knowledgebase solution exists, IT Specialist updates incident with:
    • Knowledge Article number, if found

    • Detailed analysis

    • Details of triage actions

    • Why the incident is being reassigned (if assigned incident to another group)

    • Actions that need to be performed (if assigning to another group)

    • Proceed to Step 8

    First Level Support
    7. If no solution exists, IT Specialist updates incident with:
    • Detailed analysis

    • Details of triage actions

    • Why the incident is being reassigned indicating no solution found

    • Actions that need to be performed.
      IT Specialist stops the Incident Management Process and routes incident record to Problem Management for resolution

    First Level Support
    8. Exit to Resolve or Dispatch Procedure First Level Support
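
The research order in Steps 6 through 8 above (Known Error first, then Knowledge Library, then Problem Management) can be sketched as a single function. This is illustrative only: the function name, the argument types, and the returned labels are assumptions, and the sketch assumes that a Known Error match continues to Resolve or Dispatch in the same way a Knowledge article match does.

```python
from typing import Optional, Tuple


def route_after_research(
    known_error: Optional[str],
    knowledge_article: Optional[str],
) -> Tuple[str, Optional[str]]:
    """Return (next procedure, reference to record on the incident).

    Each argument is the identifier of a matching record, or None when the
    Knowledgebase search found nothing in that library.
    """
    if known_error is not None:
        # Step 6: a relevant Known Error exists, so follow its instructions.
        # Assumption: the incident then exits to Resolve or Dispatch.
        return ("Resolve or Dispatch", known_error)
    if knowledge_article is not None:
        # Step 6a: a Knowledge Library solution exists; record the article.
        return ("Resolve or Dispatch", knowledge_article)
    # Step 7: no solution exists; route the record to Problem Management.
    return ("Problem Management", None)
```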

    Figure 2.148.2-4

    Resolve or Dispatch Procedure  
    Steps Roles
    1. Identify Knowledge Article
    • If article NOT found, Route to Problem Management Process

    • If article found, based on the guidance in the Knowledgebase:

      □ Proceed to Step 2: Service Desk Initiates Repairs
        OR

      □ Proceed to Step 3: Dispatch to Service Support Provider


    NOTE: Critical Incidents will be routed to Escalation & Info Alerts SOP
    First Level Support
    2. Service Desk Initiates Repairs
    Knowledgebase review indicated incident is to be resolved without need for assignment.

    Note:

    This is a standard fix and/or recovery, details are available and no Service Support Provider is required to directly complete restoration of service


    If a Change Request is needed, a Change Management task ticket is submitted requiring approval or knowledge before solution can be implemented.
    IT Enterprise Service Desk Specialist initiates repairs based on information provided in Knowledgebase and Standard Operating Procedures related to the issue:
    • A remote tool may be used to resolve issue

    • Incident will be updated with documentation of issue resolved

    • Incident will have resolution steps taken and documented, and reference Knowledge Article used

    • Technical or managerial escalation rules apply

      Note:

      This work is generally not dispatched to supplemental support; doing so requires managerial approval


    Go to Step 7
    First Level Support
    3. Dispatch to Service Support Provider
    • An incident is dispatched based on need of physical touch, special handling and/or meeting criteria described in Standard Operating Procedures, IRM, SLA, etc.
      If a Change Request is needed, a Change Management task ticket is submitted requiring approval or knowledge before the change to resolve the issue is made.
      See specific Service Support Provider Standard Operating Procedures (SOP) for Incident Routing methodologies and Knowledge Article development

    • Assign to Service Support Provider staff in related organization: they perform work and take steps to resolve based on information provided in Knowledgebase, OLAs & IRS Standard Operating Procedures related to the issue.

    • Go to Step 4

    First Level Support
    4. Look up Estimated Resolution Time per SLA/OLA/UC:
    • IT Specialist reviews and verifies resolution time by checking related specific SLA/OLA/UC

    • IT Specialist updates incident with information found

    • Go to Step 5

    First Level Support
    5. Set User/Customer expectations based on Priority:
    • Set expectation based on SLA & Directive Policies by informing the User or Customer of the Estimated Resolution Time according to the priority and time frames as defined (See Catalog of Services or Knowledge Article).

    • Go to Step 6

    First Level Support
    6. Provide Incident number to User/Customer
    Remind User or Customer they can follow-up for incident status by calling the Service Desk or accessing the website.
    First Level Support
    □ System Monitoring Process automatically tracks work status and time open. System
    □ Assign ticket to appropriate Service Support Provider assignment group per Knowledge Article.
    Note: If not a critical incident, they will invoke the Service Support Provider Organization work procedures.
    Note: Critical Incidents invoke Escalation & Info Alerts SOP.
    Go to Step 7
    System
    or First Level Support
    7. Confirm Incident Resolved
    • Specialist working the incident reviews and verifies incident is resolved based on OLAs/SLAs/UCs, and the IRM guidelines after following Knowledge Articles resolution guidance

    • Concurrence from User/Customer that service is restored is REQUIRED


    If issue UNRESOLVED, go to Step 8
    If issue was RESOLVED, go to Step 10
    First Level Support Service Support Provider (whoever is working the ticket)
    8. Escalate for Managerial or Technical Review
    If the incident is unresolved and additional assistance is needed, escalate to a higher-level technical employee or the Incident Manager in the group
    First Level Support Service Support Provider (whoever is working the ticket)
    □ The appropriate Second-Level IT Specialist/ Incident Manager/ Service Support Provider, reviews ticket details to determine issues with resolving incident; they will:
    • Determine root cause for failure to resolve

    • Develop plan of action to resolve incident

    • Take steps needed to resolve incident such as:
      □ Assign incident for additional processing
      □ Flag incident for Knowledgebase document modification
      □ Provide training or direction
      □ Provide managerial intervention
      □ Share or assign incident to other supplemental support
      □ Follow Problem Management IRM


    If incident still unresolved, go to Step 9
    Second Level Support
    /Incident Manager/Service Support Provider as appropriate
    □ If incident resolved, go to Step 10 First Level Support
    9. Document Resolution Failure
    Incident is unresolved, document incident and evaluate for incident routing
    • Update incident with details of why resolution in Knowledgebase did not work

    • What was done

    • What the results were

    • Document Knowledgebase (articles) used

    First Level Support
    Second Level Support
    Service Support Provider
    (whoever is working the ticket)
    Flag Incident and Route to Problem Management Process Second Level Support
    Service Support Provider
    (whoever is working the ticket)
    10. Update Resolution details
    Incident is resolved, IT Specialist updates incident with Resolution information:
    • Update incident with clear resolution detail from Knowledgebase

    • Input resolved time

    • Document results

    • Verify ticket codes and Impact & Urgency are accurate

    First Level Support
    or Service Support Provider
    (whoever is working the ticket)
    Exit to Incident Closure Procedure System
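
The branching in Steps 7 through 10 of the Resolve or Dispatch procedure above can be traced with a short sketch. The function name `resolution_path` and its two boolean inputs are hypothetical; the step labels are copied from the procedure.

```python
def resolution_path(resolved_initially: bool,
                    resolved_after_escalation: bool = False) -> list:
    """List the steps an incident passes through from Step 7 onward."""
    steps = ["7. Confirm Incident Resolved"]
    if not resolved_initially:
        # Unresolved at Step 7: escalate for review.
        steps.append("8. Escalate for Managerial or Technical Review")
        if not resolved_after_escalation:
            # Still unresolved: document the failure and hand off.
            steps.append("9. Document Resolution Failure")
            steps.append("Flag Incident and Route to Problem Management Process")
            return steps
    # Resolved at Step 7 or after escalation: record the resolution and exit.
    steps.append("10. Update Resolution details")
    steps.append("Exit to Incident Closure Procedure")
    return steps
```

Calling `resolution_path(False, True)` traces an incident that needed escalation but was then resolved, so it passes through Steps 7, 8, and 10.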

    Figure 2.148.2-5

    Closure Procedure  
    Steps Roles
    1. Resolved Incidents Received
    Resolved Tickets (incidents) are received from the Resolve or Dispatch procedure, or auto-routed from the Incident Recording procedure.
    • Incident will be resolved systemically when time frame has been met:

    • If resolved SYSTEMICALLY, go to Step 2

    • If NOT SYSTEMICALLY, go to Step 3

    System
    2. Systemic Closure
    System closes incident and Procedure ends.
    System
    3. Review Incident details
    • Review Incident details to ensure all steps taken to resolve and all ticket fields are completed accurately
      Can the incident close?

    • If No, return to Investigate and Diagnose procedure

    • If Yes,

      • If closing based on Known Error with no workaround, manually close using the following guidelines:

      □ Notate the Known Error number and search for the associated Problem Record
      □ Document the following in the solution field of the Incident record
      □ Problem PM##_KE## is being investigated by IT Specialist, Resolution Code of CROSSREF go to Step 4

    First Level Support
    Second Level Support
    Service Support Provider (whoever is working the ticket)
    4. Close the Incident.
    Go to Step 5.
    First Level Support
    Second Level Support
    Service Support Provider (whoever is working the ticket)
    5. User Satisfaction Survey
    • Survey triggered by statistically valid sampling.

    • Statistics for trend analysis and reporting are gathered for Report Generation.


    Procedure Ends
    System

    Figure 2.148.2-6

    Exhibit A: Priorities:

    There are four priorities, numbered 1 through 4. The Priority code is calculated by a formula that is based on the Impact and Urgency rating of the Incident.

    • Priority 1 - Severe Critical Work Stoppage – Any issue causing severe mission critical work stoppage or any IT issue impacting safety or health, i.e. fire, shock from equipment etc. Impact may be on multiple internal or external customers and service to taxpayers. Immediate action required.
      o Assignment of Ticket - No Later than 30 minutes
      o Updates: On an hourly basis
      o Target Resolution Time - Within 4 hours

    • Priority 2 - Potential Critical Work Stoppage – Any issue that could have a direct impact on the service to taxpayers or if its scope is multi-user and there is no work-around. Could lead to severe mission critical work stoppage if actions are not taken to resolve problem.
      o Assignment of Ticket - No Later than 1 hour
      o Updates: At least every 2 hours
      o Target Resolution Time - Within 8 hours (1 working business day)

    • Priority 3 – Any issue causing work stoppage for one customer with no work around. Examples include password resets or unlocks; service disruption for single customer resulting in work stoppage; alert notification of authorized outage or scheduled maintenance resulting in a work stoppage
      o Assignment of Ticket - No Later than 1 hour
      o Updates: At least every 4 hours
      o Target Resolution Time - Within 16 hours (2 working business days)

    • Priority 4 – Any issue that is non-critical or a non-software problem, is not a work stoppage, and has a workaround. Examples include service disruption for single customer not involving work stoppage or alert notification of authorized outage or scheduled maintenance not resulting in a work stoppage
      o Assignment of Ticket - No Later than 2 hours
      o Updates: No later than 3 business days
      o Target Resolution Time - Within 32 hours (4 working business days)

    Exceptions: Some areas may have a different Service Level requirement for which a Service Level Agreement has been approved, and some processes have security-level implications that require immediate responses. Examples: Counsel Hotline tickets; lost/stolen BlackBerry/iPhone
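
The assignment and resolution targets for Priorities 1 through 4 above can be held in a small lookup structure. This is a sketch under stated assumptions: the class and function names are hypothetical, elapsed time is assumed to be supplied in working hours, and the approved exceptions governed by separate Service Level Agreements are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriorityTargets:
    assign_within_hours: float   # "Assignment of Ticket - No Later than"
    resolve_within_hours: int    # "Target Resolution Time - Within"


# Values transcribed from the Priority 1-4 descriptions in Exhibit A.
TARGETS = {
    1: PriorityTargets(assign_within_hours=0.5, resolve_within_hours=4),
    2: PriorityTargets(assign_within_hours=1, resolve_within_hours=8),
    3: PriorityTargets(assign_within_hours=1, resolve_within_hours=16),
    4: PriorityTargets(assign_within_hours=2, resolve_within_hours=32),
}


def resolution_target_missed(priority: int, working_hours_open: float) -> bool:
    """True when an incident has been open longer than its target resolution time."""
    return working_hours_open > TARGETS[priority].resolve_within_hours
```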

Exhibit A: Priorities (continued)

The Priority code is calculated by a formula that is based on the Impact and Urgency rating of the Incident.

  1. Impact: The degree of business disruption

    Impact Description
    1 Enterprise - Severe, widespread business disruption; or Nationwide or multi-site outage of system on "Premium Service List", or Affects multiple Business Areas; or major disruption to taxpayers or external partners and no workaround
    2 Site/Department - A facility/department/function is unable to operate; or affects 30 or more users; or disruption to groups of taxpayers or external partners, or there is a workaround in place for a system on the "Premium Service List"
    3 Multiple Users - Multiple individuals (<30) are unable to operate
    4 User - Single user unable to operate
  2. Urgency: How quickly the issue must be resolved

    Urgency Description
    1 Critical - Interferes with core business functions or loss/potential loss of mission critical data. An immediate and sustained effort using all necessary resources until resolved. On-call procedures and Vendor support used as required
    2 High - Interferes with subset of core business functions, or some functions remain operational with workaround option. Technicians respond immediately, assess the situation, may interrupt other staff working low or medium priority jobs for
    3 Medium - Interferes with normal completion of work, or tasks are more difficult but not impossible to complete, with medium impact to productivity. Response using standard procedures and operating with normal supervisory management structures
    4 Low - Interferes with normal completion of work or tasks are more difficult but not impossible to complete, with minimal impact on productivity. Response using standard procedures as time allows.
  3. Prioritization Matrix (based on Urgency and Impact)
    To determine priority, intersect urgency and impact

                Urgency
    Impact    1    2    3    4
    1         1    1    2    2
    2         1    2    2    3
    3         2    2    3    3
    4         2    3    3    4
  4. Prioritization times

    Note:

    These times are general and specific times will be found in SLAs

    Priority Code   Description   Target Response Time   Target Resolution Time
    1               Critical      Immediate              4 Hours
    2               High          1 Hour                 8 Hours
    3               Medium        4 Hours                2 Business Days
    4               Low           1 Day                  4 Business Days
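
The Prioritization Matrix in item 3 is a plain lookup: the Impact rating selects the row and the Urgency rating selects the column. A minimal sketch of that lookup follows; the names `PRIORITY_MATRIX` and `calculate_priority` are illustrative assumptions and do not describe how the ticketing system implements its formula.

```python
# Rows are Impact 1-4, columns are Urgency 1-4, as in the Prioritization Matrix.
PRIORITY_MATRIX = (
    (1, 1, 2, 2),  # Impact 1 (Enterprise)
    (1, 2, 2, 3),  # Impact 2 (Site/Department)
    (2, 2, 3, 3),  # Impact 3 (Multiple Users)
    (2, 3, 3, 4),  # Impact 4 (User)
)


def calculate_priority(impact: int, urgency: int) -> int:
    """Intersect Impact and Urgency to get the Priority code (1-4)."""
    if impact not in range(1, 5) or urgency not in range(1, 5):
        raise ValueError("impact and urgency must each be between 1 and 4")
    return PRIORITY_MATRIX[impact - 1][urgency - 1]
```

For example, a single-user issue (Impact 4) with Critical urgency (Urgency 1) comes out as Priority 2 under this matrix.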