2.152.1 Data Engineering Policy

Manual Transmittal

August 19, 2019

Purpose

(1) This transmits revised IRM 2.152.1, Data Engineering, Data Engineering Policy.

Material Changes

(1) Revised the entire document using the updated Integrated Process Management Policy Templates, adding the Internal Control section.

Effect on Other Documents

IRM 2.152.1, dated October 13, 2015, is superseded.

Audience

This policy is applicable to all projects following the Enterprise Life Cycle (ELC), and all applications that use the Data Management Service.

Effective Date

(08-19-2019)


Chief Information Officer

Program Scope and Objectives

  1. Overview - The objective of the Data Engineering Policy is to establish the high-level guidelines for the Data Management Process, Naming Data Elements/Objects, and, where applicable, future subprocesses within the IRS for agency-wide project offices to follow.

  2. Purpose - The purpose of this policy is to establish the organizational policy for planning and performing the Data Engineering Processes within the IRS environment. This policy has been created to manage enterprise-wide data engineering processes more easily and efficiently so that the business can maximize the value of its data within the IRS.

  3. Audience - This policy is applicable to all projects following the Enterprise Life Cycle (ELC), and all applications that use the Data Management Service.

  4. Policy Owner - The Chief Information Officer is responsible for overseeing all aspects of the agency-wide IRS Data Management Process.

  5. Program Owner - The Chief of the Process Maturity Group under the Director of Enterprise Services, Solution Engineering Division, is responsible for the administration, procedures, and updates related to the program IRM.

  6. Primary Stakeholders - The Chief of the Data Engineering Group under Enterprise Services, Solution Engineering Division, is the primary stakeholder of this IRM and has input into its procedures.

  7. Program Goals - This IRM provides the fundamental knowledge and procedural guidance for employees who request Data Management Services from the Data Engineering Group.

Background

  1. Enterprise Services Solution Engineering is responsible for the establishment, development, implementation, maintenance, and revision of this Policy (Directive). Approval of this Policy (Directive), including updates, rests with the ACIO of Enterprise Services. All proposed changes to this directive must be submitted to the Chief, Solution Engineering Process Maturity Practice Group.

Purpose

  1. The purpose of this policy is to establish the organizational policy for planning and performing the Data Engineering Processes within the IRS environment. This policy has been created to manage enterprise-wide data engineering processes more easily and efficiently so that the business can maximize the value of its data within the IRS.

Scope

  1. All projects following the Enterprise Life Cycle (ELC) are required to perform Data Engineering Processes and associated activities in accordance with this policy.

Authority

  1. Enterprise Services Solution Engineering Process Maturity Practice Group is responsible for the development, implementation, and maintenance of this directive. Approval of this directive, including updates, rests with the ACIO Enterprise Services, Solution Engineering Process Maturity Practice Group. All proposed changes to this directive must be submitted to Enterprise Services, Solution Engineering Process Maturity Practice Group.

Mandate

  1. Applicability.

    All IT projects following the ELC will follow the Data Engineering Process to ensure all work products are produced.

  2. Completion of the Data Engineering Process allows:

    • Execution of Data Engineering Processes

    • Obtaining needed resources to perform Data Engineering Processes

    • Control of work products required by the Data Management Process

    • Engagement of stakeholders affected by the Data Management Process

    • Monitoring and Controlling of the Data Management Process

    • Collection of Data Management Process measures

    • Review of Data Management Process status with higher-level management

    • Establishment of the Data Management Process within the project's defined processes

    • Submission of lessons learned and process improvement suggestions from the execution of the Data Management Process

  3. Data Engineering Processes
    Data Engineering serves the user demand for live, heterogeneous data. Data Engineering is the solution needed to meet users' rapidly growing requests for increasing volumes and varieties of data. The Data Engineering core process is the solution for these needs within the IRS, and Data Management is the key subprocess for accomplishing this solution.

    1. Data Management - Data Management is one of the major processes of Data Engineering. It includes five core components: Collect, Consolidate, Certify, Connect, and Consume. These five C's are the cornerstone of Data Management.

    • Collect data that exists in databases (structured) and outside databases (unstructured), from across the agency and from external sources.

    • Consolidate and condense disjoined data and projects, separating what can be homogenized from what can remain distinct, so that the relevant data is most easily accessible.

    • Certify the designated sources, processes, and projects; understand data quality; and govern the data assets.

    • Connect the data to each other as well as users to the data. Building out the metadata repository helps expose data elements to larger audiences.

    • Consume data by presenting the data in the proper/desired state for the recipient to efficiently make the right decisions. This effectively turns the data into information that can then be used strategically as knowledge.
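
    The five components above can be sketched as an ordered set of stages. The following Python fragment is illustrative only and is not part of this policy: the names Stage, next_stage, and is_ready_to_consume are hypothetical, and the assumption that the components are completed in the order listed is made purely for the example.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The five core components, numbered in the order they are listed."""
    COLLECT = 1
    CONSOLIDATE = 2
    CERTIFY = 3
    CONNECT = 4
    CONSUME = 5


def next_stage(completed):
    """Return the first stage not yet completed, or None when all are done.

    Assumption for this sketch: stages are worked in listed order.
    """
    for stage in Stage:
        if stage not in completed:
            return stage
    return None


def is_ready_to_consume(completed):
    """True once every stage before CONSUME has been completed."""
    return all(s in completed for s in Stage if s < Stage.CONSUME)
```

    For example, a data asset that has only been collected and consolidated would report Certify as its next stage under these assumptions.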

    2. Naming Data Elements - Naming Data Elements is a new major process added to Data Engineering. The goal of Naming Data Elements is to support the business goal for data naming standardization by ensuring that the EDSG are followed Service-wide. Naming Data Elements will be divided into two parts: Naming Business Data Elements and Naming Technical Data Elements. The objectives of Naming Data Elements are:

    • Standardized data names facilitate data sharing, data consistency, and communication between organizational areas in the Service.

    • Standardized data names also assist the data administration effort by making it possible to eliminate data redundancies and inconsistencies.
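
    As a rough illustration of how inconsistent names can hide redundant data elements, the Python sketch below groups names that differ only in letter case or separator characters. It does not implement the EDSG rules, which are not reproduced in this policy; the functions normalize and find_inconsistent_names and the sample element names are hypothetical.

```python
import re
from collections import defaultdict


def normalize(name):
    """Reduce a name to lowercase words joined by underscores.

    Only case and separators are normalized; this is an assumed
    comparison key, not an EDSG naming rule.
    """
    words = re.findall(r"[A-Za-z0-9]+", name)
    return "_".join(word.lower() for word in words)


def find_inconsistent_names(names):
    """Return groups of distinct spellings that share one normalized key."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize(name)].add(name)
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


# Hypothetical sample names, used only to exercise the sketch.
sample = ["TAXPAYER_ID", "Taxpayer-ID", "taxpayer id", "FILING_DATE"]
duplicates = find_inconsistent_names(sample)
```

    In this sample, the three spellings of the taxpayer identifier fall into one group, which is the kind of redundancy a single standardized name would remove.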

    3. Future Subprocesses - This policy applies to future Data Engineering processes that are created or changed for the Data Engineering chapter of the IRM by the process. This includes the addition of new processes, as well as additions, deletions, and modifications to existing processes.