Protecting Federal Tax Information (FTI) in Databases Through Labeling



Databases are the central point for reviews conducted by the Office of Safeguards. Agencies use databases to store federal tax information (FTI), which is then retrieved through queries for use in applications. This makes the FTI accessible to end users through those applications and to database administrators (DBAs) on the back end.

It is recommended that FTI be kept separate from other information to the maximum extent possible to avoid inadvertent disclosures. However, in situations where physical separation is impractical, Publication 1075, Tax Information Security Guidelines for Federal, State and Local Agencies (Pub. 1075), requires records to be clearly labeled to indicate that FTI is included in the record.

Database Element Labeling

The Office of Safeguards has observed a wide range of database data element labeling practices while reviewing labeling and auditing procedures. Agencies are responsible for implementing audit logging of FTI, which includes identifying the data to be audited, creating audit logs as the data is accessed, and performing analytics and monitoring on those audit logs. The first step in effectively auditing FTI access is to label the data properly upon receipt. If the data is not properly labeled, the auditing function cannot be configured to be compliant.
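The three audit logging steps named above can be illustrated with a minimal Python sketch. It is not a compliant audit implementation. The table and column names, the FTI_COLUMNS set, and the helper functions are all hypothetical and exist only to show why labeling has to come first: an access can be logged as an FTI access only if the accessed element is already labeled as FTI.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Step 1: identify the data to be audited.
# Hypothetical (table, column) pairs the agency has labeled as FTI.
FTI_COLUMNS = {("taxpayer", "ssn"), ("taxpayer", "agi")}


@dataclass(frozen=True)
class AuditRecord:
    user: str
    table: str
    column: str
    accessed_at: datetime


audit_log = []


def record_access(user, table, columns):
    """Step 2: append one audit record per accessed column labeled as FTI."""
    for column in columns:
        if (table, column) in FTI_COLUMNS:
            audit_log.append(
                AuditRecord(user, table, column, datetime.now(timezone.utc))
            )


def access_counts(log):
    """Step 3: a simple analytic, the number of FTI accesses per user."""
    counts = {}
    for rec in log:
        counts[rec.user] = counts.get(rec.user, 0) + 1
    return counts


# "city" is not labeled as FTI, so it produces no audit record.
record_access("dba01", "taxpayer", ["ssn", "city"])
record_access("analyst07", "taxpayer", ["agi", "ssn"])
```

If the "ssn" column were missing from the labeled set, both accesses to it would go unrecorded, which is the failure mode the paragraph above describes.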

Organized, consistently applied labeling helps the agency better enforce access control to the data elements, easily identify what needs to be audited and logged, and identify the network components that are required to comply with Pub. 1075.

Mandatory Requirements for Data Labeling

To use a database to store FTI, the agency must meet the following mandatory requirements and apply them to each database that contains FTI:

1)  Proper FTI Labeling 

Agencies must determine and identify the FTI data they have and consistently apply labels to that data before it migrates into the agency’s IT environment, in such a way that the data is easily identified even when commingled. In addition, the agency must maintain a data labeling legend or other explanatory document that identifies the labeling methodology applied and allows a reviewer to quickly identify which data elements are FTI in an individual table or database.

In order to properly label data, agencies must first determine how the data is to be identified. Typically, this includes identifying the data at the entry point into the agency’s environment. Although it is not a requirement to include source information in the labeling convention, this is strongly recommended in order to better track FTI throughout the IT environment.

Data labeling can be accomplished in a variety of ways depending on the vendor. For example, one product allows for the configuration of data security based on sensitivity labels composed of a combination of levels, compartments, and groups. The level indicates the sensitivity, the compartment indicates the type of data, and the group further separates the data and can indicate its origin. Another product allows for label-based access control (LBAC), which enforces access at the row and column levels.
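As a rough illustration of a label built from a level, compartments, and groups, the following Python sketch models the three components and one possible read check. It does not reproduce any vendor's API. The SensitivityLabel class, the can_read rule, and the example values are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensitivityLabel:
    level: int                             # sensitivity; higher is more sensitive
    compartments: frozenset = frozenset()  # type of data, e.g. {"FTI"}
    groups: frozenset = frozenset()        # origin of the data, e.g. {"IRS"}


def can_read(user, data):
    """Assumed rule: the user's level must meet the data's level, the user
    must hold every compartment on the data, and, if the data carries any
    groups, the user must share at least one of them."""
    level_ok = user.level >= data.level
    compartments_ok = data.compartments <= user.compartments
    groups_ok = not data.groups or bool(data.groups & user.groups)
    return level_ok and compartments_ok and groups_ok


# Hypothetical labels for one FTI row and two users.
fti_row = SensitivityLabel(3, frozenset({"FTI"}), frozenset({"IRS"}))
caseworker = SensitivityLabel(3, frozenset({"FTI"}), frozenset({"IRS"}))
clerk = SensitivityLabel(3, frozenset(), frozenset({"IRS"}))
```

Under these assumptions, the caseworker can read the FTI row, while the clerk cannot because the clerk's label lacks the "FTI" compartment, even though the levels match.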

Pub. 1075 does not prohibit FTI data from being commingled with non-FTI data, provided the proper controls are in place. However, when data is commingled, it must be identified at the lowest level at which all data is FTI. For example, if data is commingled at the table level, e.g., a database comprising both FTI and non-FTI data tables, the tables must be labeled in such a way that it is readily apparent that those tables contain FTI. Additionally, if data is commingled within a table that includes FTI and non-FTI data, the FTI data must be explicitly labeled and identified as such at the column, record or data element level.

Labeling must be applied consistently. Data elements must retain their labeling throughout the data movement process, from the point that the data is received to wherever it moves within the network. The labels must never be removed from the data.

Proper labeling allows an agency to easily identify the security requirements for the data and allows data of different sensitivity to be stored together. This reduces administrative overhead from a database maintenance perspective, because the agency does not have to maintain a separate database for each data sensitivity level.

If FTI is not properly identified and labeled in the agency’s environment, it is likely that data will not be audited correctly. In databases with tables that only contain FTI (are not commingled), the FTI can be identified at the table level. In situations where FTI is commingled with non-FTI at the data element level, the FTI must be labeled at that level so that each FTI data element can be clearly identified as such.
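The difference between table-level and element-level labeling can be sketched as a small labeling legend in Python. The table names, column names, and the LEGEND structure are hypothetical. An agency's actual legend can take any form that lets a reviewer quickly identify which elements are FTI.

```python
# Hypothetical labeling legend.
# scope "table": the table holds only FTI, so every column is FTI.
# scope "column": the table is commingled, so only the listed columns are FTI.
LEGEND = {
    "irs_extract": {"scope": "table"},
    "case_file": {"scope": "column", "fti_columns": {"ssn", "refund_amount"}},
}


def is_fti(table, column):
    """Return True if the legend labels this table/column pair as FTI."""
    entry = LEGEND.get(table)
    if entry is None:
        return False  # table is not in the legend, so no FTI label is recorded
    if entry["scope"] == "table":
        return True
    return column in entry["fti_columns"]


def fti_elements(table, columns):
    """List the columns of a table that the legend labels as FTI."""
    return [c for c in columns if is_fti(table, c)]
```

For example, fti_elements("case_file", ["ssn", "office_code", "refund_amount"]) returns only the two labeled columns, while any column of "irs_extract" is treated as FTI because that table is labeled as a whole.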

2)  Documenting Labeling Methodology

The agency must document its labeling methodology and maintain a listing of how each element is labeled throughout each database that contains FTI. The agency can choose its own documentation approach; however, a matrix is often most useful for documenting data labeling.

Using a matrix allows the agency not only to map how data is labeled throughout the environment, but also to map group permissions to the data elements. This serves the dual purpose of documenting the methodology and ensuring that least privilege is applied.
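One way to picture such a matrix is as a list of rows, one per data element, recording the label, the source, and the groups permitted to access it. The Python sketch below is an illustration under assumed names. The table, column, and group names, the FTI_AUTHORIZED_GROUPS set, and both helper functions are hypothetical.

```python
# Hypothetical labeling matrix: one row per data element.
MATRIX = [
    {"table": "case_file", "column": "ssn", "label": "FTI",
     "source": "IRS", "groups": {"fti_caseworkers"}},
    {"table": "case_file", "column": "refund_amount", "label": "FTI",
     "source": "IRS", "groups": {"fti_caseworkers", "fti_auditors"}},
    {"table": "case_file", "column": "office_code", "label": "NON-FTI",
     "source": "agency", "groups": {"all_staff"}},
]

# Hypothetical set of groups approved for FTI access.
FTI_AUTHORIZED_GROUPS = {"fti_caseworkers", "fti_auditors"}


def least_privilege_violations(matrix, authorized):
    """Return (table, column, group) for each FTI element that is granted
    to a group outside the authorized set."""
    violations = []
    for row in matrix:
        if row["label"] != "FTI":
            continue
        for group in sorted(row["groups"] - authorized):
            violations.append((row["table"], row["column"], group))
    return violations


def elements_for_group(matrix, group):
    """List the (table, column) pairs a group is permitted to access."""
    return [(r["table"], r["column"]) for r in matrix if group in r["groups"]]
```

With the rows shown, least_privilege_violations(MATRIX, FTI_AUTHORIZED_GROUPS) returns an empty list. Granting "all_staff" access to the "ssn" element would surface as a violation, which is how the same matrix can both document the labels and support a least privilege check.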


Additional information can be found in the following documents: