Data Governance Glossary


Access Management

Policies and procedures that define, track, and control the data an individual can access in systems or applications.


Anonymization

The de-identification of data by stripping Personally Identifiable Information (PII) from it.

Audit Trail

An electronic log used to track computer activity—at a system or individual level.


Authentication

The process of verifying the identity of a user or process when accessing a computing system.

Authorized Access

Also known as Permissible Access, allowances made for internal and external users to view and process PII on a need-to-know basis.

Big Data

Extremely large amounts of data processed with powerful systems and analytics tools to find trends and patterns that lead to insights.

Business Intelligence

Insights, resulting from data analysis, that are delivered as reports, dashboards, and visualizations (e.g., charts, graphs).

Chief Information Security Officer

The executive-level manager who directs the strategy, operations, and budget for protecting the enterprise's information assets.


Classification

The process of labeling and sorting data assets based on predefined criteria, such as sensitivity level or data owner.

Cloud-Access Security Broker

An on-premises or cloud-based security policy enforcement point that is placed between cloud service consumers and cloud service providers to combine and interject enterprise security policies as cloud-based resources are accessed.


Compliance

The practices of ensuring that sensitive data types are organized and secured in a way that enables organizations to meet legal and governmental regulations. Examples of common data privacy laws include GDPR, HIPAA, CCPA, and FDA regulations.

Content Governance

A system of tools, policies, people, and processes defining who within an organization has authority and control over unstructured, human-readable files, commonly known as enterprise content. Common examples include documents, PDFs, email, and images.

Cloud Content Governance

A class of technology that uses cloud-first architecture and machine learning to analyze large volumes of unstructured, human-readable files and automatically apply protections. 

Content Management

Processes and technology used to collect, deliver, retrieve, and manage data in a variety of formats. It is also used for data governance.


CRUD

An acronym for the four basic functions of persistent storage (Create, Read, Update, and Delete), which are the interfaces to databases that allow users to create, view, modify, and delete data.
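
As an illustration only, the four operations can be sketched against an in-memory SQLite database using Python's standard `sqlite3` module. The `customers` table and the function name `crud_demo` are assumptions made for this example, not part of any particular product.

```python
import sqlite3


def crud_demo():
    """Run one Create, Read, Update, and Delete cycle on a throwaway table."""
    conn = sqlite3.connect(":memory:")  # in-memory database, discarded on close
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

    # Create: insert a new row.
    cur = conn.execute("INSERT INTO customers (name) VALUES (?)", ("Ada",))
    customer_id = cur.lastrowid

    # Read: fetch the row that was just created.
    created = conn.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()[0]

    # Update: modify the existing row.
    conn.execute(
        "UPDATE customers SET name = ? WHERE id = ?", ("Ada Lovelace", customer_id)
    )
    updated = conn.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()[0]

    # Delete: remove the row and count what is left.
    conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))
    remaining = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

    conn.close()
    return created, updated, remaining
```

Calling `crud_demo()` returns the name as first stored, the name after the update, and the row count after the delete.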

Cyber Hygiene

The practices and steps that users of computers and other devices take to maintain system health and improve online security. These practices are often part of a routine to ensure the safety of identity and other details that could be stolen or corrupted.

Data Analytics

Processes and algorithms used to examine raw data and extract meaning. Data analysis systems transform, organize, and model data to draw conclusions and identify patterns.

Data Architecture

A framework of rules, policies, standards, and models that govern what data is collected and how it is used, stored, managed, and integrated across an organization.

Data Breach

An incident that involves the unauthorized viewing, access, retrieval, or removal of data, whether intentional or unintentional, by an individual, application, or service.

Data Classification

The organization of data based on its level of sensitivity and the impact should that data be used, shared, altered, or destroyed without authorization.

Data Discovery

The process of detecting and organizing data by identifying key characteristics and applying a distinct class to make it easier to locate, track, and retrieve. Once discovered, data is tagged, often with a specification of its access restrictions.
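
A minimal sketch of pattern-based discovery and tagging, assuming two simplified regular expressions (an email address and a US Social Security number format). The pattern names and the `discover_tags` function are hypothetical; real discovery tools typically combine many more signals than pattern matching.

```python
import re

# Simplified, illustrative patterns; they are assumptions, not exhaustive rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def discover_tags(text: str) -> set:
    """Return the set of tags whose pattern occurs at least once in the text."""
    return {tag for tag, pattern in PATTERNS.items() if pattern.search(text)}
```

A document's resulting tag set could then drive access restrictions, for example by treating anything tagged `us_ssn` as restricted.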

Data Custodian

An administrator responsible for the appropriate storage, transportation, and access of data as well as the technical environment and database structure.

Data Flow

The path that data follows through a system—from source to final instantiation (e.g., report, database).

Data Governance

A system of tools, policies, people, and processes for defining who within an organization has authority and control over data assets and how those data assets may be used and shared.

Data Hygiene

The collective processes conducted to ensure the cleanliness of data. Data is considered clean if it is relatively error-free. Dirty data can be caused by a number of factors including duplicate records, incomplete or outdated data, and the improper parsing of record fields from disparate systems. Poor data hygiene can lead to improper classification that causes increased risk exposure.

Data Integrity

The completeness, validity, reliability, accuracy, and consistency of data.

Data Loss Prevention (DLP)

A strategy for making sure that end users do not send sensitive or critical information outside the corporate network. The term is also used to describe software products that help network administrators control what data end users can share. DLP addresses both purposeful and accidental data loss.

Data Minimization

The practice of limiting the collection of personal information to that which is directly relevant and necessary to accomplish a specified purpose.
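
One way to picture this is a filter that keeps only the fields a stated purpose needs. The sketch below is illustrative; the field names and the `minimize` helper are assumptions made for the example.

```python
def minimize(record: dict, required_fields: set) -> dict:
    """Keep only the fields needed for the specified purpose; drop the rest."""
    return {
        field: value
        for field, value in record.items()
        if field in required_fields
    }


# Hypothetical signup form data; a newsletter only needs the email address.
signup = {"email": "jane@example.com", "birth_date": "1990-01-01", "shoe_size": 38}
newsletter_record = minimize(signup, {"email"})
```

Here `newsletter_record` holds only the `email` field, so the other values are never stored for this purpose.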

Data Residency

The physical or geographic location of an organization's data or information. Some privacy regulations restrict where certain kinds of data may be stored or transferred. The GDPR, for example, limits transfers of personal data outside the European Economic Area.

Data Ownership

Assignment of formal accountability and legal ownership of data—a single piece or set of data. This comes with a list of owner rights and responsibilities.

Data Privacy

Defines whether or how data is shared, with whom data is shared, and how data is legally collected or stored.

Data Security

Measures to protect data, residing in systems or applications, from unauthorized access, corruption, or theft.

Data Silos

A collection of information isolated from — and not accessible to — other parts of the organization due to incompatible systems or permissions. Silos restrict visibility of data and content across the organization.

Data Steward

Role within an organization focused on high-level policies and procedures for the monitoring, security, and management of data use according to data governance rules related to access, accuracy, classification, and privacy.

Data Stewardship

Tactical coordination, implementation, and enforcement of data governance policies and procedures across an organization's data stakeholders.

Data-Centric Audit and Protection (DCAP)

An approach to information protection that combines extensive data security and audit functionality with simplified discovery, classification, granular policy controls, user- and role-based access, and real-time data and user activity monitoring to help automate data security and regulatory compliance.


Database

An organized collection of structured data that can easily be accessed, managed, and updated.


De-identification

The process of removing or obscuring personal data in a document or record.


Encryption

The process of converting information or data into a code, especially to prevent unauthorized access.

Enterprise Password Management

Practices and software that use security controls to prevent internal and external threats from capturing master passwords, credentials, secrets, tokens, and keys to gain access to confidential systems and data. These centralized password management systems can be on-premises or in the cloud. Most important is that they provide password security for all types of privileged accounts throughout your enterprise.

File Versioning

The digital practice of storing more than one version of a file simultaneously. The goal is to provide access to previous iterations of important documents for a number of potential scenarios, including mitigating ransomware attacks.

Identity and Access Management (IAM)

A framework of policies and technologies for ensuring that the proper people in an enterprise have the appropriate access to technology resources.

Indirect Identifiers

Information that can be combined with other information to identify individuals, such as date of birth, race, education, occupation, marital status, and zip code.


Information

The result of processing data to provide context and meaning.

Insider Threat or Insider Data Breach

An incident where sensitive, protected, or confidential personal information and personal data has potentially been accessed, stolen, or used without authorization due to negligence or malice by an employee or contractor.


Masking

A data security technique in which sensitive data is obfuscated for testing or training purposes.
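
As a rough sketch of this kind of obfuscation, the hypothetical helper below hides all but the last few characters of a value so that test data keeps a realistic shape without exposing the original. The function name and its parameters are assumptions for illustration.

```python
def mask_value(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Replace all but the last `visible` characters with `mask_char`."""
    # Mask everything when nothing should show or the value is too short
    # to reveal a suffix safely.
    if visible <= 0 or len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]
```

For example, `mask_value("4111111111111111")` keeps only the final four digits and replaces the rest with asterisks.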


Metadata

A set of data that describes and gives information about other data.

Nonpublic Personal Information (NPI)

Personal data that is not publicly available, such as data obtained through Internet collection devices or cookies.

Personally Identifiable Information (PII)

Information that can directly identify an individual when used alone or with other relevant data. PII includes name, address, social security number or other identifying number or code, telephone number, and email address.


Policy

A rule or set of rules that outlines how companies and their employees are intended to interact with corporate data.


Pseudonymization

A data management and de-identification procedure by which personally identifiable information fields within a data record are replaced by one or more artificial identifiers. The GDPR defines it as the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of additional information.
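
A minimal sketch of replacing PII fields with artificial identifiers, assuming a keyed hash (HMAC-SHA256 from Python's standard library) as the replacement scheme. The secret key plays the role of the separately held additional information. The function names, the `p_` prefix, and the 16-character truncation are illustrative assumptions, and other schemes such as lookup tables are equally valid.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Derive a stable artificial identifier from a value using a keyed hash."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "p_" + digest[:16]  # truncated for readability in this sketch


def pseudonymize_record(record: dict, pii_fields: set, secret_key: bytes) -> dict:
    """Replace only the listed PII fields; leave all other fields untouched."""
    return {
        field: pseudonymize(str(value), secret_key) if field in pii_fields else value
        for field, value in record.items()
    }
```

Because the same input and key always yield the same identifier, records can still be joined across data sets, while attributing them to a person requires access to the key.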


Ransomware

A type of malware that threatens to publish the victim's data or perpetually block access to it unless a ransom is paid to regain control of the content.

Risk Management

The identification, analysis, assessment, control, and avoidance of risk through precautionary steps that reduce or eliminate threats.

Sensitive Data

Data that is classified as information that requires elevated protection and tightly managed access.

Shadow IT

The use of computer or network hardware or software by a department or individual without the knowledge of the IT or security group within the organization.

Structured Data

Data that resides in a fixed field within a file or record. It is easily organized and searchable. A SQL database containing customer records is a common example of structured data.

System of Record

A storage system that is the authoritative data source for a given data element or piece of information.


Tag

A label attached to a data asset for the purpose of identifying, grouping, or providing context.

Threat Detection

The practice of analyzing the entirety of an information-security ecosystem to identify any malicious activity that could compromise the network.

Unstructured Data

Information, in many different forms, that doesn't fall into conventional data models and thus typically isn't a good fit for a mainstream relational database. Most enterprise content is unstructured data, including email, documents, spreadsheets, images, and PDFs. 

Zero-Day Detection

The deployment of behavior- or activity-based AI to detect suspicious actions indicative of an attack in near-real time.

Zero-Day Vulnerability

A previously unknown vulnerability in a software program or operating system that attackers can exploit before a fix is available.

Last updated: 05/05/2021
