2.152.1 Data Engineering Policy

Manual Transmittal

October 13, 2015

Purpose

(1) This transmittal revises IRM Part 2, Chapter 152, Section 1, Data Engineering Policy, with updated IT IRM Policy formats and the inclusion of a new process in the Data Engineering Chapter. The Data Engineering Policy for the Data Engineering Processes is owned by Solution Engineering.

Material Changes

(1) Revises the entire document with updated IPM IT IRM Policy formatting.

(2) Adds a new Naming Data Elements process.

Effect on Other Documents

This policy does not impact any other document.

Audience

This policy is applicable to all projects following the Enterprise Life Cycle (ELC), and all applications that use the Data Management Service.

Effective Date

(10-13-2015)

Terrence V. Milholland, Chief Technology Officer

Policy

  1. Data Engineering Policy

Administration

  1. Enterprise Services Solution Engineering is responsible for the development, implementation, and maintenance of this directive. Approval of this directive, including updates, rests with the ACIO of Enterprise Services. All proposed changes to this directive must be submitted to the Chief, Solution Engineering Process Maturity Practice Group.

Purpose

  1. The purpose of this policy is to establish the organizational policy for planning and performing the Data Engineering Processes within the IRS environment. This policy has been created to manage enterprise-wide data engineering processes more easily and efficiently so that the business can maximize the value of its data within the IRS.

Scope

  1. All projects following the Enterprise Life Cycle (ELC) are required to perform Data Engineering Processes and associated activities in accordance with this policy.

Mandates

  1. Applicability.

    All IT projects following the ELC will follow the Data Engineering Process to ensure all work products are produced.

  2. Completion of the Data Engineering Process allows:

    • Execution of Data Engineering Processes

    • Obtaining needed resources to perform Data Engineering Processes

    • Control of work products required by the Data Management Process

    • Engagement of stakeholders affected by the Data Management Process

    • Monitoring and controlling of the Data Management Process

    • Collection of Data Management Process measures

    • Review of Data Management Process status with higher-level management

    • Establishment of the Data Management Process within the project's defined processes

    • Submission of lessons learned and process improvement suggestions from the execution of the Data Management Process

  3. Data Engineering Processes
    Data Engineering serves the user demand for live, heterogeneous data. Data Engineering is the solution needed to meet users' rapidly growing requests for increasing volumes and varieties of data. The Data Engineering core process is the solution for these needs within the IRS, and Data Management is the key subprocess for accomplishing this solution.

    1. Data Management - Data Management is one of the major processes of Data Engineering. Data Management includes five core components: Collect, Consolidate, Certify, Connect, and Consume. These five C's are the cornerstone of Data Management.

      • Collect data that exist in databases (structured) and outside databases (unstructured), from across the agency and from external sources.

      • Consolidate and condense disjoined data/projects into that which can be homogenized and that which can remain distinct, so that the relevant data are most easily accessible.

      • Certify the designated sources, processes, and projects; understand data quality; and govern the data assets.

      • Connect the data to each other, as well as users to the data. Building out the metadata repository helps expose data elements to larger audiences.

      • Consume data by presenting the data in the proper/desired state for the recipient to efficiently make the right decisions. This effectively turns the data into information that can then be used strategically as knowledge.

    2. Naming Data Elements – Naming Data Elements is a new major process added to Data Engineering. The goal of Naming Data Elements is to support the business goal for data naming standardization by ensuring that the EDSG are followed Service-wide. Naming Data Elements will be divided into two parts: Naming Business Data Elements and Naming Technical Data Elements. The objectives of Naming Data Elements are:

      • Standardized data names facilitate data sharing, data consistency, and communication between organizational areas in the Service.

      • Standardized data names also assist the data administration effort by making it possible to eliminate data redundancies and inconsistencies.

    3. Future Subprocesses - This policy applies to future Data Engineering processes that the process owner creates or changes for the Data Engineering Chapter of the IRM. This includes the addition of new processes, as well as additions, deletions, and modifications to existing processes.
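
As an illustration of the Naming Data Elements objectives in item 2 above, the following is a minimal sketch of how inconsistent spellings of the same data element name might be detected. It is not part of this policy and does not implement the EDSG; the normalization rule (lowercase words joined by underscores), the function names normalize_name and find_redundant_names, and the sample element names are all hypothetical assumptions chosen for illustration.

```python
import re
from collections import defaultdict


def normalize_name(name):
    """Reduce a data element name to a comparison key.

    Hypothetical rule for illustration only (not taken from the EDSG):
    split camelCase boundaries, treat any run of non-alphanumeric
    characters as a separator, lowercase, and join with underscores.
    """
    # Insert a space where a lowercase letter or digit is followed by an uppercase letter.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    words = re.split(r"[^A-Za-z0-9]+", spaced)
    return "_".join(word.lower() for word in words if word)


def find_redundant_names(names):
    """Group names that share a comparison key and keep groups with more than one variant."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


# Hypothetical element names, as they might appear across different systems.
sample_names = ["TaxpayerID", "taxpayer_id", "Taxpayer-Id", "FilingStatus"]
redundant = find_redundant_names(sample_names)
```

Under these assumptions, the three spellings of the taxpayer identifier fall into one group, which flags them as candidates for a single standardized name, while FilingStatus has no competing variant and is not reported.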