Foundational Data Governance

The Office of Inspector General (OIG) at the US Department of Health and Human Services offers resources for maintaining compliance in healthcare. Its documentation is comprehensive and easy to access. But these resources have little impact if your organization can’t find the data an audit requires. Medical researchers face a similar problem: funders provide a baseline of metrics needed for grant applications, but researchers cannot find the data itself in time to apply. Missing or obfuscated data is expensive at both the individual and institutional level. It doesn’t have to be.

At CTI, we know that maintaining foundational data governance practices throughout an organization, particularly in healthcare, has the potential to save money, bring in more funding, and even preserve institutional status. Basic governance, even something as straightforward as asset cataloging, increases transparency and reduces confusion during an audit. Classifications categorize data assets so they are easier to search by topic or regulatory need. And auto-provisioning removes the manual intervention needed to give researchers and analysts the data they need.

Protected Health Information (PHI)

In addition to concerns around Personally Identifiable Information (PII), healthcare organizations are responsible for safeguarding Protected Health Information (PHI). This includes patient information such as demographics, diagnoses, insurance, and visit data. Beyond securing this information, institutions also have to follow strict data lifecycle requirements, which drive the archiving and deletion of data either at the patient’s request or within a specific time frame. Each of these elements requires an enterprise-wide understanding of what data is PHI, where it sits across the data architecture, and what processes are impacted by that data.

Implementing a governance program gives insight into each of these factors, as well as who has been given access to the data and how that access was provided. CTI knows that in the healthcare space it is important to keep a program system-agnostic while still covering all governance features. Whether you use Informatica, Alation, Purview, Unity, or another data governance platform, maintaining metadata within a single application provides a comprehensive view of data assets, improving security, compliance, and the ability to audit for PHI.


Accreditation

For medical institutions, there are few problems larger than the possibility of losing accreditation. The accreditation process can be long and arduous, and with so much on the line it is essential that the right data is identified, defined, and located on time. Data governance provides a single platform to identify the data you need, understand where it sits across enterprise data systems, and request that data for quick-turnaround provisioning. This reduces the stress and administrative burden on the organization during a process that carries significant weight for a medical institution.

Research Grant Applications

Federal and private grant applications for medical researchers are comprehensive and complex, and they require metrics up front at submission. These metrics often differ from one funding organization to the next and can be vague, resulting in confusion and in rejections caused by misunderstanding. This is beginning to change as larger funding organizations standardize their metrics and publish clear baseline needs and definitions.

One such instance is the Agency for Healthcare Research and Quality (AHRQ), where standardizing the metrics required at application is intended to ease the process for applicants and reviewers alike. By building out baselines such as facilities details (type, location, capacities, capabilities, and availability) and available supporting resources (classes, training, career enrichment programs, peer groups, logistical support), the AHRQ has shown the importance it places on standardization in data. It has also given researchers an opportunity to submit applications more easily and quickly, with greater confidence that they have provided the correct information. However, this only becomes quick and easy if their institution also has enterprise data governance in place. These are all data that the institution should already maintain; the hurdle is finding where they exist. The ability to search a catalog for these metrics significantly reduces time to application, and therefore time to funding. Complement that with classifications (detailed below) and the process becomes even faster.

Key Elements of Healthcare Governance


Policies

Creating transparency in security practices is required when building out governance in the healthcare space. As mentioned above, highly sensitive data moves around healthcare institutions under carefully crafted, though not always well-understood, security and provisioning restrictions. Through data governance, these policies are made available to all users.

At CTI we have a few specific best practices around building policies. A well-constructed policy should include clear language (please no legalese!), definitions of roles and their allowed permissions, and any specific requirements beyond security clearance. For example, your department may purchase third-party data to supplement what’s created in-house. The data itself is not sensitive; however, there are only 10 licenses available for the entire organization. The policy should clearly document that limit and be made available to anyone interested in using, or responsible for provisioning, that data.

In another situation, sensitivity restrictions may be structured around how the data is used. A dataset with sensitive employee data should not be used outside of HR purposes. The users of this data may span a variety of roles and departments, including HR analysts, department heads, DEI initiatives, or C-suite reviews. With the security policy documented and available, it becomes clear who would be allowed to provision this data and who should be restricted.
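The two situations above (a license cap and a purpose restriction) can be captured in a policy record that a catalog or provisioning workflow could check. The following is only a minimal sketch: the Policy class, its field names, and the role and purpose labels are hypothetical and do not come from any specific governance platform.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Policy:
    """Hypothetical policy record; all field names are illustrative."""
    name: str
    allowed_roles: set                    # roles permitted to request the data
    allowed_purposes: set                 # permitted uses, e.g. {"hr"}
    license_limit: Optional[int] = None   # None means there is no seat cap
    license_holders: set = field(default_factory=set)

    def can_provision(self, user, role, purpose):
        """Return (allowed, reason) for a single access request."""
        if role not in self.allowed_roles:
            return False, f"role '{role}' is not covered by policy '{self.name}'"
        if purpose not in self.allowed_purposes:
            return False, f"purpose '{purpose}' is outside the permitted use"
        seats_full = (
            self.license_limit is not None
            and user not in self.license_holders
            and len(self.license_holders) >= self.license_limit
        )
        if seats_full:
            return False, "no license seats remaining"
        return True, "ok"
```

A third-party dataset with 10 licenses would set license_limit=10, while the HR dataset would set allowed_purposes={"hr"} and leave license_limit unset. Returning a reason alongside the decision keeps the outcome readable for the person who asked, in line with the plain-language guidance above.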


Classifications

Aligned with policies, classifications in data governance support security efforts and allow those policies to be reused for assets across the catalog. They can also give greater context to how that data is used. A PHI classification associates an asset with the PHI policy, and it also makes that asset easy to find when the data is needed for an audit. Medical universities can use classifications to define which assets are needed for accreditation, ensuring the data is available and can be provided quickly to auditing organizations such as the LCME. Private healthcare companies can assign product classifications to indicate which data belong to which products or applications.

One thing to keep in mind with classifications is to provide as much information as possible through the metadata while keeping the number of classifications in check. The categorization these classifications create should be concise and precise. If it is questionable whether an asset falls into a classification, don’t include it. If a more accurate classification needs to be made, make it. As always in governance, this will be an iterative process, but you want to be sure your users are not confused by assets that don’t belong.
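To make the idea of searching by classification concrete, here is a small sketch of a catalog lookup. The Asset class, the asset names, and the classification labels are made up for illustration; a real governance platform would expose this through its own search interface.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical catalog entry carrying a small set of classifications."""
    name: str
    classifications: set = field(default_factory=set)


# Illustrative catalog; these asset names are assumptions, not real systems.
CATALOG = [
    Asset("patient_demographics", {"PHI"}),
    Asset("student_outcomes", {"Accreditation"}),
    Asset("clinical_visits", {"PHI", "Accreditation"}),
    Asset("cafeteria_menu"),
]


def find_by_classification(catalog, classification):
    """Return the names of all assets tagged with the given classification."""
    return sorted(a.name for a in catalog if classification in a.classifications)


def classification_counts(catalog):
    """Count assets per classification, to help keep the label set in check."""
    counts = {}
    for asset in catalog:
        for label in asset.classifications:
            counts[label] = counts.get(label, 0) + 1
    return counts
```

Calling find_by_classification(CATALOG, "PHI") would return the two PHI assets, which is the kind of lookup an audit would start from. The count helper is one simple way to spot labels that are barely used or overloaded during the iterative cleanup described above.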


Lineage

Lineage shows the data journey from collection through transformation all the way to consumption. This includes source applications, databases, data warehouses and lakehouses, ETL code, and visualizations. Overall, lineage allows you to see what data is impacted by upstream and downstream processes and the various transformations that data goes through. In the healthcare field this is particularly important when tracing PHI through the system, understanding the drivers behind a diagnosis, or identifying the calculations behind a metric during an audit.

Providing a comprehensive view of your enterprise data architecture through metadata reduces time spent hunting for information or reworking code because impact analysis missed an affected system or asset. This applies to your business users as well. A researcher may not understand the complex transformations of technical lineage, but by reviewing business lineage they can identify a metric in Epic’s front end and then trace that metric to a report they need. Or a provider may see a patient’s diagnosis on their BI dashboard and follow that diagnosis back into SlicerDicer.
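Impact analysis over lineage is, at its core, a graph traversal. The sketch below assumes lineage has been exported as a simple mapping from each asset to the assets it feeds; the node names are invented for illustration and do not describe any real Epic or warehouse schema.

```python
from collections import deque

# Hypothetical lineage: each key feeds the assets listed as its value.
LINEAGE = {
    "ehr_source": ["staging_db"],
    "staging_db": ["warehouse"],
    "warehouse": ["readmission_metric", "bi_dashboard"],
    "readmission_metric": ["audit_report"],
}


def downstream(graph, start):
    """Return every asset reachable from `start` (what a change would impact)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def upstream(graph, target):
    """Return every asset that feeds `target` (where a value came from)."""
    reversed_graph = {}
    for parent, children in graph.items():
        for child in children:
            reversed_graph.setdefault(child, []).append(parent)
    return downstream(reversed_graph, target)
```

Here downstream(LINEAGE, "staging_db") answers "what breaks if this table changes?", while upstream(LINEAGE, "audit_report") answers "which sources stand behind this audited number?". Both are breadth-first searches; the upstream version simply walks the same edges in reverse.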

Reducing Human Intervention

Manual intervention in any process exposes that process to error. The more manual intervention, the more potential for error. When giving access to sensitive health data, it is of utmost importance to reduce those errors. As an extension of an end-to-end governance program, auto-provisioning works to remove provisioning errors and decrease an organization’s exposure rate.

Auto-provisioning can be simple or complex, depending on the overall maturity of your data program and the resources available. A simple auto-provisioning setup may include an approval process, where the final approval triggers the provisioning process. More complex systems may include business logic rules, provisioning based on policy or role, or integrations with specific delivery platforms such as Snowflake or Databricks.
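The simple case, where the final approval triggers provisioning, might look something like the sketch below. Everything here is an assumption made for illustration: the AccessRequest class, the approver names, and the grant_access function, which only records the grant in a list where a real system would call its delivery platform.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRequest:
    """Hypothetical access request that tracks who still needs to approve."""
    requester: str
    asset: str
    required_approvers: set
    approvals: set = field(default_factory=set)
    provisioned: bool = False


# Stand-in for a real delivery system; grants are only recorded in memory.
GRANTS = []


def grant_access(request):
    """Placeholder provisioning step (a real one would call the target platform)."""
    GRANTS.append((request.requester, request.asset))
    request.provisioned = True


def approve(request, approver):
    """Record an approval and provision automatically once all are in."""
    if approver not in request.required_approvers:
        raise PermissionError(f"{approver} is not an approver for {request.asset}")
    request.approvals.add(approver)
    all_approved = request.approvals >= request.required_approvers
    if all_approved and not request.provisioned:
        grant_access(request)
    return request.provisioned
```

With two required approvers, the first call to approve returns False and the second returns True, at which point the grant has been recorded without anyone provisioning by hand. The provisioned flag guards against granting twice if an approver repeats their approval.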

Stakeholder Engagement

Governance is a tool with which you can increase the usability of your data. There is no intrinsic value without that data or those users. Engaging stakeholders early in the process of developing a data governance program is crucial for its success. These stakeholders encompass everyone from data owners and stewards to end users and governing body members.

In healthcare, stakeholder engagement is even more critical due to the unique challenges of the industry. Concerns over Protected Health Information (PHI) and the often-fragmented network of locations and providers can lead to data silos. Data and metadata are often held close to the chest, hindering collaboration and limiting the potential benefits of data analysis. A successful data governance program must build trust from the start. This means demonstrating early wins to those impacted. By showcasing how governance can decrease access barriers while simultaneously increasing security, you give stakeholders a reason to become invested in the program’s goals.

Final Thoughts

CTI recognizes the complexities of healthcare data governance. Fragmented provider networks, patient privacy concerns, and ever-evolving regulations can create a disconnect between data and its potential users. Building a successful governance program goes beyond engaging data owners and stewards. Our approach incorporates legal, compliance, and security teams from the outset, ensuring alignment with funding data requirements and other regulations.

Our expertise empowers you to navigate these intricacies and establish a robust, long-term data governance program. CTI streamlines processes, enabling researchers to efficiently apply for grants and administrators to locate data for audits with ease. The resulting improvements in data quality, accessibility, and security unlock the full potential of your healthcare data, leading to significant and measurable benefits.

Amanda Darcangelo is a Senior Data & Analytics Consultant at CTIData.

Contact us for more information on how we can help support your Data Governance initiative.

© Corporate Technologies, Inc.