Data management is the set of practices, processes, and tools that connect all phases of the information lifecycle to derive value from data. It includes the policies, strategies, and programs developed, executed, and managed to govern, secure, and operationalize the data an organization collects.
Developing and maintaining a framework for ingesting, organizing, storing, mining, and archiving data is an integral part of data management. A framework supports all aspects of the data lifecycle management process and guides optimization that improves decision-making with high-quality data.
A data management framework includes:
- designing and deploying a data architecture
- developing data models
- creating, processing, and storing data
- integrating data into a data warehouse or data lake
- checking data quality
- identifying and resolving errors
- implementing data governance
Data Management Challenges
Despite the worthwhile benefits of data management, it has its challenges, including the following:
- Connecting disparate platforms to avoid issues, such as:
  - duplicate data
  - incorrect data
  - improperly formatted data
  - data reconciliation
- Managing data accessibility to provide stakeholders with the information they need without overprovisioning.
- Meeting data security requirements to protect sensitive information, which means controlling data sprawl while balancing the desire and need to collect vast amounts of information.
- Ensuring data quality, cleanliness, and consistency when data comes from multiple sources, many of which are not controlled by the organization and are spread across different repositories.
- Maintaining data integrity and fidelity to protect it from loss or corruption.
- Sharing siloed data between teams.
- Preparing and storing unstructured data to make it accessible for analysis.
- Integrating the principles of data management into the organization’s culture.
Importance of Data Management
Effective data management supports:
- Productivity: Makes it easier to find, analyze, interpret, and share information.
- Cost-efficiency: Eliminates costs related to duplication and time spent looking for information.
- Operational dexterity: Enables faster responses to internal and external requests, as well as market and competitive changes.
- Security risk mitigation: Protects information by knowing where it is and who has access to it.
- Reduced data loss: Maintains easily accessible, regular backups of information.
- Better decisions: Provides all users with ready access to the same versions of the most recent information.
Data Management Best Practices
Establishing data management best practices and strategies that align with business requirements and goals is vital for success. Best practice considerations include:
Develop and write a data management plan that has scheduled periodic reviews and updates. The plan should include:
- estimated data usage
- guidelines for access
- directions for archiving
Understand the volume and velocity of incoming data as well as the use and accessibility requirements to determine which storage best fits data management objectives—e.g., data warehouses, data lakes, on-prem, or in the cloud.
Plans for backup must include the storage of up-to-date copies of data and the steps required for its recovery.
All details of the approach, along with policies and procedures for tactical functions, must be documented.
Data management requires that security systems and processes be put into place to protect data and comply with corporate, industry, and government regulations.
Part of security and data management protocols should include details about how credentials are issued (and replaced, if lost) and how permissions are granted to provide fine-grained access controls. Directions about how to access data across multiple, disparate sources should also be considered.
As a subset of access, consider user privileges and gates, such as:
- Who owns the data?
- Can the data be shared, changed, or copied?
- Who can access the data?
- What data should all company employees have access to? (for example, company benefits information)
- When can the data be accessed?
- Is the data considered sensitive information?
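The questions above can be captured as fields on a policy record and checked in code. The following is a minimal sketch, not a definitive implementation: `AccessPolicy`, `can_access`, the role names, and the access-window fields are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class AccessPolicy:
    """Hypothetical policy record; each field answers one question above."""
    owner: str                    # who owns the data
    allowed_roles: frozenset      # who can access the data
    shareable: bool = False       # can the data be shared, changed, or copied
    sensitive: bool = False       # is the data considered sensitive
    open_from: time = time(0, 0)  # when can the data be accessed
    open_until: time = time(23, 59, 59)


def can_access(policy: AccessPolicy, role: str, at: time) -> bool:
    """Allow access only for a permitted role inside the access window."""
    in_window = policy.open_from <= at <= policy.open_until
    return role in policy.allowed_roles and in_window


# Assumed examples: benefits data is open to all employees, payroll is not.
benefits = AccessPolicy(owner="hr", allowed_roles=frozenset({"employee", "hr"}))
payroll = AccessPolicy(owner="hr", allowed_roles=frozenset({"hr"}),
                       sensitive=True, open_from=time(8, 0), open_until=time(18, 0))

print(can_access(benefits, "employee", time(23, 0)))
print(can_access(payroll, "employee", time(9, 0)))
```

Keeping the policy as data rather than scattering checks through application code makes it easier to review who can reach what, and when.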
Especially with large volumes of data, automation (e.g., monitoring, data transformation) is a must-have for effective data management.
- Data quality
Follow established data governance directives and integrate data cleansing into the data lifecycle.
- Data tags
Unique, persistent identifiers should be appended to data and document descriptions to facilitate data discovery.
Data management should include processes for measuring efficacy. Metrics that should be reported include:
- data accuracy
- data standardization and consistency
- data completeness
- data integrity
- data timeliness
- data-to-error ratio
- number of empty or incorrect records
- data storage costs
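A few of the metrics above can be computed directly from a set of records. The sketch below is illustrative only. It assumes records are dictionaries, treats `None` and empty strings as missing values, and defines the data-to-error ratio as total records divided by records with at least one missing required field; organizations may define these metrics differently. The function name `quality_metrics` and the sample data are hypothetical.

```python
def quality_metrics(records, required_fields):
    """Compute simple completeness and error counts for a list of dict records."""
    def missing(record, field_name):
        return record.get(field_name) in (None, "")

    total = len(records)
    cells = total * len(required_fields)
    filled = sum(1 for r in records for f in required_fields if not missing(r, f))
    empty = sum(1 for r in records if all(missing(r, f) for f in required_fields))
    with_errors = sum(1 for r in records if any(missing(r, f) for f in required_fields))

    return {
        "completeness": filled / cells if cells else 0.0,
        "empty_records": empty,
        "records_with_errors": with_errors,
        # Assumed definition: total records per record containing an error.
        "data_to_error_ratio": total / with_errors if with_errors else None,
    }


sample = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": None, "email": None},
    {"id": 4, "email": "d@example.com"},
]
metrics = quality_metrics(sample, ["id", "email"])
print(metrics)
```

Reporting these numbers on a schedule, rather than once, is what makes it possible to see whether data quality is improving.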
Data Management Functions
- Big data collection, integration, management, and analysis
- Data analytics
- Data architecture
- Data governance
- Data integration
- Data lakes
- Data marts
- Data modeling
- Data privacy
- Data quality
- Data security
- Data warehouses
- Database management systems
- Document and content management
- Master data management
Data Management Techniques
A common technique applied for data management revolves around what is known as FAIR Data Principles: Findable, Accessible, Interoperable, and Reusable (for data and metadata).
- Data are assigned unique and persistent identifiers.
- Data folder names are based on projects (i.e., not people).
- Succinct, unique names are used for folders.
- Data are described with rich metadata that explains the identifier.
- Folder names are not changed and metadata is not deleted.
- Data are registered or indexed in a searchable resource.
- Data are retrievable by their identifier, using a standard protocol that is open, free, and universally deployable.
- Authentication and authorization rules are implemented and enforced.
- Sensitive data are explicitly named and managed by a designated person (data steward).
- Consistent naming methods are used, along with formal, accessible, shared, and broadly applicable vocabulary.
- Data include qualified references to other data and metadata.
- Data are captured in consistent templates.
- Data are richly described with accurate, contextual, and relevant attributes.
- A detailed provenance is indicated for data and analysis.
- Data meet domain-relevant standards.
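To make the Findable and Accessible points concrete, here is a small sketch of a registry that assigns an identifier to each dataset, stores descriptive metadata, and supports lookup by identifier and by search term. It is a toy, in-memory illustration. The `Registry` class and its methods are hypothetical, and `uuid4` only stands in for a real persistent identifier scheme.

```python
import uuid


class Registry:
    """Hypothetical in-memory index of datasets keyed by unique identifier."""

    def __init__(self):
        self._items = {}

    def register(self, name, metadata):
        """Assign a unique identifier and store the dataset description."""
        identifier = str(uuid.uuid4())  # stand-in for a persistent ID scheme
        self._items[identifier] = {
            "identifier": identifier,
            "name": name,
            "metadata": dict(metadata),
        }
        return identifier

    def get(self, identifier):
        """Retrieve a dataset description by its identifier."""
        return self._items[identifier]

    def search(self, term):
        """Return identifiers whose name or metadata values contain the term."""
        term = term.lower()
        return [
            identifier
            for identifier, item in self._items.items()
            if term in item["name"].lower()
            or any(term in str(v).lower() for v in item["metadata"].values())
        ]


registry = Registry()
pid = registry.register("q3-sales", {"project": "revenue-forecast", "format": "CSV"})
print(registry.search("forecast") == [pid])
```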
Benefits of Effective Data Management
With data management, data is in the right place, at the right time, in the correct format, and its usage is governed. This results in far-reaching benefits that positively impact all aspects of an organization’s operations, including:
- Increased productivity
- information is easier to find and understand
- results and conclusions can be validated
- information structure can be easily shared
- information is stored for future reference and easy retrieval
- duplicate records are eliminated
- redundant research, analysis, and work are reduced
- the amount of data in play can be measured and optimized
- decision making is expedited
- operating expenses are reduced
- customer experiences are improved
- Reduced risk
- data security is improved
- data loss is reduced
- data can be trusted
Data Management Platforms
A data management platform (DMP) provides a unified system to collect and organize first-, second- and third-party audience data from any source. Much of the data processed by a DMP is considered big data, which is often unstructured. A DMP helps prepare data so that it can be broadly accessed and used.
DMPs consist of a number of components, including:
- data lakes and warehouses
- data analytics tools
The benefits of a DMP include:
- unified data
- breaking down silos
- a cohesive view of data
- continuous results and reporting
- increased data accessibility
Data Management Roles
Depending on the organization, data management functions can be consolidated and run by a few people or separated into granular roles and handled by many people. Key functions include:
- big data management
- data analysis
- data collection
- data governance
- data model design
- data quality management
- data security
- data stewardship
- data warehousing
- database architecture
- master data management
- metadata generation
Data Management Tasks
Manage Data Management Processes and Policies
- Create and enforce policies for quality data collection.
- Ensure adequacy, accuracy, and legitimacy of data.
- Develop and implement efficient and secure procedures.
- Establish rules and procedures for data access and sharing.
- Assist in applying and implementing data standards and guidelines on data ownership, coding structures, and data replication.
Support Teams with Data Management
- Help team members adhere to legal and company standards.
- Assist with reports and data extraction.
- Work with stakeholders to ensure alignment of data rules and operations.
- Translate business requirements and models into data warehouse designs.
- Provide data consulting to support business and information technology initiatives.
Enable Data Management Systems
- Monitor and analyze information and data systems.
- Evaluate and optimize systems’ performance to discover ways of enhancing them (new technologies, upgrades, etc.).
- Protect data from unauthorized access—accidental and malicious.
- Troubleshoot data-related problems with error detection and correction, process control and improvement, and process design strategies.
- Authorize systems’ maintenance or modifications.
- Define, design, and build databases.
- Conduct regular data cleansing to rid systems of old, unused data or duplicate data.
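The duplicate-removal part of data cleansing can be sketched as follows. This is one simple approach under stated assumptions: records are dictionaries, duplicates are detected by comparing chosen key fields after trimming whitespace and lowercasing, and the first occurrence wins. The function name `deduplicate` and the sample records are hypothetical.

```python
def deduplicate(records, key_fields):
    """Keep the first record for each normalized key; return (kept, dropped)."""
    seen = set()
    kept, dropped = [], []
    for record in records:
        # Normalize so that "Ana@Example.com " and "ana@example.com" match.
        key = tuple(str(record.get(f, "")).strip().lower() for f in key_fields)
        if key in seen:
            dropped.append(record)
        else:
            seen.add(key)
            kept.append(record)
    return kept, dropped


customers = [
    {"email": "Ana@Example.com ", "name": "Ana"},
    {"email": "ana@example.com", "name": "Ana P."},
    {"email": "bo@example.com", "name": "Bo"},
]
kept, dropped = deduplicate(customers, ["email"])
print(len(kept), len(dropped))
```

Returning the dropped records instead of silently discarding them leaves room for a review step before anything is deleted.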
Data Management and Data Governance
At a high level, data governance establishes policies and procedures around data, while data management executes those policies and procedures with the goal of improving data quality and optimizing usage.
Rules set forth with data governance establish a strategy for the appropriate use, handling, and storage of data. A data governance framework defines:
- who can access and use the data
- data storage processes, including how long data is retained
- security protocols that are related to data storage
Key components of data governance are:
- people—data users and data stewards
- standards—for data quality
- policies—for access, use, retention, security, and storage
Effective data governance strategy delivers benefits across an organization, including:
- support in maintaining regulatory compliance
- minimized risks
- improved data security
- standards for data quality
Data management is the implementation of the data governance framework’s strategy, standards, and policies. For instance:
- A data governance policy may specify that client data must be retained for five years to meet regulatory requirements. Data management processes are used to direct data archiving and deletion within data storage systems.
- A data governance policy may dictate that protected health information (PHI) can only be accessed by certain staff members. Data management processes are created to grant role-based access to the staff members who meet the proper criteria to access sensitive data.
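The retention example above can be expressed as a small rule that a data management process might run against each record. This is a sketch under assumptions: the five-year retention period comes from the example, while the one-year "active" period, the action names, and the functions `add_years` and `retention_action` are hypothetical.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Add whole years to a date, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def retention_action(created: date, today: date,
                     active_years: int = 1, retention_years: int = 5) -> str:
    """Decide what to do with a record based on its age."""
    if today < add_years(created, active_years):
        return "keep_active"   # still in day-to-day use (assumed period)
    if today < add_years(created, retention_years):
        return "archive"       # retained to meet the governance policy
    return "delete"            # retention period has ended


created = date(2020, 1, 15)
print(retention_action(created, date(2020, 6, 1)))
print(retention_action(created, date(2023, 1, 1)))
print(retention_action(created, date(2025, 1, 15)))
```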
Data Management and Data Privacy
While different, data management and data privacy have some overlap, including:
- access control
- data accuracy and integrity
- accountability for privacy and security
Combined, data governance and data privacy deliver robust programs to address each constituent’s operational, compliance, and security requirements.
A few examples follow.
Define and classify
For all data elements, include attributes that help direct appropriate data governance and data privacy policies. Attributes include definitions, the purpose of use, data quality rules, risk impact, ownership, and classification (e.g., confidential, sensitive, internal).
|Data governance|Data privacy|
|Data governance provides guidance about how to handle information classified as sensitive (e.g., access level and prioritization of governance projects).|Data privacy helps identify specific risks associated with processing activities involving that information.|
Data tagging helps organize information more efficiently by associating pieces of information (e.g., usage parameters, descriptions) with tags or keywords.
|Data governance|Data privacy|
|Data governance describes how data can be used.|Data privacy helps with compliance requirements that mandate that the purpose of data use must be as described (e.g., GDPR, CCPA).|
Data provenance or lineage
Together, data management and data privacy can provide a record of the course of data as it flows from source to consumption. This includes information about all transformations the data underwent with details about how the data was transformed, what changed, and why.
|Data governance|Data privacy|
|Data governance documents and illustrates the end-to-end journey of a data element, starting from the “authoritative” source that created the data to downstream sources and applications that store it, display it, or both.|Data privacy reveals the original data source or provenance of where the data is collected, as well as its lifecycle throughout the business.|
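A lineage record of the kind described above (what was transformed, what changed, and why) can be kept alongside the data itself. The sketch below is a simplified illustration. The `TrackedDataset` and `LineageStep` classes are hypothetical, and "what changed" is reduced to a row count for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageStep:
    operation: str   # how the data was transformed
    reason: str      # why the transformation was applied
    changed: str     # what changed as a result
    at: datetime     # when the step ran


@dataclass
class TrackedDataset:
    source: str                                   # the authoritative source
    rows: list
    lineage: list = field(default_factory=list)   # ordered history of steps

    def transform(self, func, operation, reason):
        """Apply func to the rows and record the step in the lineage."""
        before = len(self.rows)
        self.rows = func(self.rows)
        self.lineage.append(LineageStep(
            operation=operation,
            reason=reason,
            changed=f"{before} -> {len(self.rows)} rows",
            at=datetime.now(timezone.utc),
        ))
        return self


dataset = TrackedDataset("crm_export", [{"email": "a@example.com"}, {"email": ""}])
dataset.transform(lambda rows: [r for r in rows if r["email"]],
                  operation="drop_blank_email",
                  reason="email is required downstream")
for step in dataset.lineage:
    print(step.operation, step.changed, step.reason)
```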
Industries that Use Data Management
Among the top data management use cases is compliance. Changing regulatory requirements are one of the biggest drivers for data governance.
Data is at the core of most of the rules that organizations must follow. While there is some overlap, the regulations and data management requirements vary by industry.
Regulations that impact companies in the life sciences industry include:
- 21 CFR Part 11
- Falsified Medicines Directive (FMD)
- General Data Protection Regulation (GDPR)
Among the regulations that financial services organizations must comply with are:
- 2003 Fair and Accurate Credit Transactions Act (FACTA)
- Bank Secrecy Act (BSA), commonly known as the Anti-Money Laundering (AML) law
- Common Reporting Standard (CRS)
- Home Mortgage Disclosure Act (HMDA)
- Payment Card Industry Data Security Standard (PCI-DSS)
- Sarbanes-Oxley Act (SOX)
Numerous areas of compliance can impact manufacturers directly or indirectly, including:
- data protection
- employment law
- export controls
- fair competition
- health, safety, and environment
- IT safety and security
- product safety
Key rules and regulations for government contracting include:
- Armed Services Procurement Act of 1947
- Berry Amendment of 1941
- Buy American Act
- Civil Sundry Appropriations Act of 1861
- Eight-Hour Work Law of 1892
- Federal Acquisition Regulation (FAR)
- Federal Acquisition Reform Act of 1996 (FARA), part of the Clinger-Cohen Act
- Federal Acquisition Streamlining Act of 1994 (FASA)
- International Traffic in Arms Regulations (ITAR)
- Public Law 95-507, Amendment to the Small Business Act (1978)
- Purveyor of Public Supplies Act of 1795
- Sherman Antitrust Act of 1890
- Small Business Act of 1953
- The Davis-Bacon Act of 1931
- Truth in Negotiations Act of 1962
- Walsh-Healey Public Contracts Act of 1936
Getting Started with Data Management
Building a data management strategy provides the foundation required to support consistent project approaches, successful integration, and business growth. At a high level, key steps in starting a data management program include:
- Identify business objectives
  - What are the overall objectives?
  - What are the most critical use cases?
  - What data is needed?
- Develop and document data processes
  - Data collection
    - What are the data sources?
    - Will users need to access external and internal assets?
    - Will the data be structured, unstructured, or both?
    - How will the data be collected?
  - Data preparation
    - How will raw data be cleaned and transformed to prepare it for analysis?
    - How will incomplete or incorrect data be fixed?
    - What are protocols for naming data, documenting lineage, and adding metadata?
  - Data storage
    - Will data be stored on-prem, cloud, or hybrid?
    - What format will structured data be stored in (XML, CSV, or relational databases)?
    - Is a data lake required for unstructured data?
    - What is the plan for data security?
  - Data analysis and distribution
    - Which teams or departments need the ability to collaborate?
    - How will users access data and prepared analyses?
    - How will data insights be distributed?
- Evaluate technology
  - Tools and platforms
  - Hardware and software
  - Cloud, on-prem, or hybrid deployment
  - Resources required for management
- Establish data governance
  - Data quality: What is the plan to keep data accurate, complete, and current?
  - Data security: What will be done for data security?
  - Data privacy: Has permission been granted to collect and use data?
  - Data transparency: Are there processes to maintain an ethical data environment?
- Train stakeholders and users
  - How will data owners be identified?
  - What skills need to be built?
  - What knowledge needs to be transferred?
  - What materials and programs are required for education?
Be Proactive with Data Management
A strategic and proactive approach, rather than an ad hoc and reactive one, helps to avoid the pitfalls of an ineffective data management program rollout or the challenges that come with not having one. Strategic data management helps make data collection and usage as effective and efficient as possible by making data easier to govern.
Regardless of the details of the data management framework, ensure that everyone understands the overarching data management strategy and how it helps them and your organization.
Egnyte has experts ready to answer your questions. For more than a decade, Egnyte has helped more than 17,000 customers with millions of users worldwide.