AI has become ingrained in everyday life, especially in the workplace. As uses of the technology continue to expand, many organizations are recognizing the need to govern their AI: everything from models to functions to agents and even the data itself. With the concurrent renaissance of data governance, however, there is often confusion over what belongs to AI governance and what lies in the realm of data governance. As both disciplines and their tactical implementations continue to mature, many organizations will find themselves defining a unique set of features for each. One question on everyone’s mind, though: how do you ensure nothing slips through the cracks as responsibilities are doled out across the two disciplines?

Data Governance

Data governance comprises the elements needed to support the use of raw data, data analytics, and dashboards. Effective data analytics can, to an extent, be built without a structured data governance program as long as there is informal rigor around data infrastructure, modeling standards, naming conventions, and access control. However, as an organization’s data use scales beyond a small data Center of Excellence (CoE), more formal features need to be deployed. These include governing bodies to develop and enforce best practices and naming conventions and to prioritize use cases, as well as a robust stewardship and ownership network in which every data asset is assigned both a steward and an owner. In some instances these will be the same person, but responsibility and accountability for both roles need to be clearly defined.

A data governance program also requires a technical platform, which can be as simple as a series of Excel files or as complex as an enterprise solution like data.world, Atlan, Informatica Data Management Cloud, Unity Catalog, or any of a host of others. Features maintained in this platform should include the ability to scan and catalog metadata from sources, develop and describe lineage, persist a standardized enterprise language, and indicate the access, security, and use requirements for any given asset. More advanced data governance solutions may also let users request access to the underlying data asset after reviewing its metadata. For more on data governance fundamentals and our approach to effective implementation, review our whitepaper on User-Empowered Governance.

AI Governance (Data Gov+)

AI governance takes the premise and foundational features of data governance and expands them to encompass more complex assets: models, functions, and development pipelines. An effective program looks beyond what would traditionally be considered data governance to examine, more holistically, what processes, procedures, and protocols need to be in place to put out functional and effective AI solutions. A center of excellence for AI governance includes traditional roles like an Enterprise AI Governance Steward but also requires individuals for AI testing audits, AI risk and compliance analysis, AI ethics analysis, and other specialized roles that cater to the specific needs of AI deployment. Additionally, governing bodies will not only be represented by stewards, owners, and governance CoE members but will expand to include data scientists, Continuous Integration/Continuous Delivery (CI/CD) engineers, quality testers, and more. This larger group will be responsible for ensuring that every step in the innovation cycle (discussed in the next blog in this series) is scrutinized before AI solutions are deployed to production or access is granted to either the metadata or the solution itself.

The technical needs of AI governance also expand to fit this broader scope. One of the more straightforward examples centers on data quality. Data quality can significantly affect whether an AI solution operates as expected or outputs biased or wholly inaccurate results. In data governance there is an expectation of measuring the data quality of assets maintained in the catalog. AI governing bodies go a step beyond this and declare data quality requirements that set the minimum quality scores and attributes a data set must meet before it can be used for AI development. Technical features of AI governance then need to be reactive to that policy, using the measurements deployed during data governance to enforce those requirements and spur remediation as needed.
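The quality-gate pattern described above can be sketched in a few lines. This is a minimal illustration, not any specific platform’s API; the metric names and minimum scores are hypothetical policy values.

```python
# Hypothetical AI-governance data quality policy: minimum scores a data set
# must meet before it can be used for AI development. Values are illustrative.
QUALITY_POLICY = {"completeness": 0.95, "validity": 0.90, "freshness": 0.85}

def approve_for_ai_use(quality_scores: dict[str, float]) -> tuple[bool, list[str]]:
    """Compare a cataloged asset's quality scores (from data governance
    measurement) against the AI-use minimums; return approval plus any
    failures to spur remediation."""
    failures = [
        f"{metric}: {quality_scores.get(metric, 0.0):.2f} < {minimum:.2f}"
        for metric, minimum in QUALITY_POLICY.items()
        if quality_scores.get(metric, 0.0) < minimum
    ]
    return (not failures, failures)

# An asset with weak validity is blocked from AI use until remediated:
approved, issues = approve_for_ai_use(
    {"completeness": 0.97, "validity": 0.88, "freshness": 0.91}
)
```

The key design point is that the thresholds live in policy (declared by the AI governing body) while the scores come from the data governance catalog, keeping the two programs loosely coupled.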

Along with features that build on data governance foundations, a number of net-new features are required for AI governance. AI solutions need rigorous bias, ethics, and QA testing; drift monitoring and remediation; version control; lifecycle automation; training compliance auditing; and other technical elements that support AI solution needs. Ultimately, the goal of AI governance is to ensure AI models work as intended and within an organization’s policies, both at deployment to production and throughout the life of the solution. Any technical feature needed to support this goal falls under AI governance.
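Of the net-new features listed above, drift monitoring is easy to illustrate. The sketch below is a deliberately simplified stand-in for production drift detection (which would typically use statistical tests such as PSI or Kolmogorov-Smirnov); the threshold is a hypothetical policy value.

```python
import statistics

# Illustrative tolerance: how far the mean model output may shift from the
# baseline captured at deployment before remediation is triggered.
DRIFT_THRESHOLD = 0.1

def drift_detected(baseline: list[float], current: list[float]) -> bool:
    """Flag drift when recent model outputs shift beyond tolerance of the
    baseline recorded when the solution was deployed to production."""
    return abs(statistics.mean(current) - statistics.mean(baseline)) > DRIFT_THRESHOLD
```

Running a check like this on a schedule, throughout the life of the solution rather than only at deployment, is what distinguishes AI governance monitoring from one-time QA testing.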

Iterative Development

Just because AI governance is an expansion of data governance does not mean that all features of data governance must be developed before integrating AI governance elements. Instead, the two can be worked iteratively and in parallel by identifying features that are intrinsically bound versus those that are independent. For example, data quality measurement (a data governance feature) needs to be deployed before data quality requirements can be set and remediation performed (both AI governance features). However, each of those features can be deployed before, and independently of, data lineage, which is a data governance feature. When developing a use case strategy, consider which features of data governance have already been deployed and which need to be deployed before moving on to the key AI features of the use case. This allows more flexible development of both programs and decreases time to value as use cases begin to feature not just governance elements but also AI models themselves (discussed in the next blog in this series).
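The "intrinsically bound versus independent" distinction amounts to a dependency graph over governance features, and a topological sort yields a valid rollout order. The feature set below mirrors the example in the text but is otherwise hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each feature lists the features that must be
# deployed before it. Bound features chain; independent ones have no links.
FEATURE_DEPENDENCIES = {
    "data_quality_measurement": [],                             # data governance
    "data_lineage": [],                                         # data governance, independent
    "data_quality_requirements": ["data_quality_measurement"],  # AI governance
    "quality_remediation": ["data_quality_requirements"],       # AI governance
}

# static_order() guarantees every feature appears after its prerequisites,
# so measurement precedes requirements, which precedes remediation, while
# lineage can be scheduled anywhere, in parallel with the rest.
rollout_order = list(TopologicalSorter(FEATURE_DEPENDENCIES).static_order())
```

Mapping a use case strategy onto a graph like this makes it explicit which data governance features block the use case's AI features and which can be deferred.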

Amanda Darcangelo is a Lead Data & Analytics Consultant at CTIData.


© Corporate Technologies, Inc.