This article is the first in a two-part series. It highlights the sources where your organization is already storing and retrieving metadata. The second article explains the difference between active and passive metadata, as well as how to begin using active metadata for immediate business impact. Step one is understanding where to access your metadata today.
Metadata and tagging are known to be key to discovering the data you need in your organization’s data lake, but metadata investment is often predicated on the idea that the work put in to organize and catalog it now will pay off in future endeavors. That premise is accurate, but organizations should also consider the business value their metadata can provide when used actively for immediate impact, rather than simply documenting and maintaining it for potential future use.
Modern organizations already store and pull far more metadata, from a wider variety of sources, than they may realize. When working to use metadata actively for immediate impact, it’s important to consider the following sources:
Master data management plays a critical role in overarching data governance efforts. It includes all the technologies, tools and systems that keep data aligned across your entire organization. The metadata available within master data management platforms consists of your master records, including all the fields within them; these are the building blocks of your organization and define your primary data domains. Beyond master data management platforms, important metadata is also stored in your operational applications, such as your enterprise resource planning (ERP) and customer relationship management (CRM) applications. Data stewards need to think upstream when considering which metadata sources to use for active impact.
Databases and data lakes are well-known stores of organizational data, and they are often easy places to access metadata as well. Many industry-standard database platforms have built-in metadata tools that require little overhead from your organization’s IT department. Tools such as Microsoft Purview, within Microsoft Azure, or AWS Glue, within Amazon Web Services (AWS), can scan and analyze this data through automated processes. By storing your data on these platforms, you gain quick insight into metadata that’s available for immediate use.
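To make the idea of built-in database metadata concrete, here is a minimal sketch in Python. It uses SQLite's own catalog purely as a stand-in, because it runs anywhere without setup; it does not represent how Microsoft Purview or AWS Glue work. The table names and the `describe_tables` helper are hypothetical and exist only for illustration.

```python
import sqlite3

def describe_tables(conn):
    """Return {table_name: [(column_name, declared_type), ...]} from the database's own catalog."""
    # sqlite_master is SQLite's built-in catalog of schema objects.
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    catalog = {}
    for table in tables:
        # PRAGMA table_info returns rows of (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(col[1], col[2]) for col in columns]
    return catalog

# Hypothetical tables, created in memory only so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
conn.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")

for table, columns in describe_tables(conn).items():
    print(table, columns)
```

The point is that table names, column names and data types are already recorded by the database itself, so no separate documentation effort is needed to read them.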
One often-overlooked source of metadata is the set of processes that move data between applications and data storage. This includes extract, transform, load (ETL) and extract, load, transform (ELT) tools that pull data into a data warehouse, application programming interfaces (APIs) that connect applications to other sources, and other tools that extract and transform datasets. These sources often capture metadata about the data that’s moving through your system, specifically about how the data is moving. The metadata pulled from these sources can also tell you about:
Additional insights can be gained by analyzing the logic that business and IT teams apply within these data processes and how that logic reflects operational needs.
There is a treasure trove of information in how end users consume data. Business users run hundreds if not thousands of reports daily, and this report usage reveals user habits over time, which can inform business decisions. Data analysts run numerous queries and searches in service of their teams, requesting information directly from organizational databases. These sources of metadata go underutilized in the majority of organizations. By monitoring report usage and database activity, organizations can see how users are leveraging the data. They can track which reports and queries are being run, by whom, and how often. This can lead to insights and opportunities that can be shared with a broader audience.
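As a rough illustration of tracking which reports are run, by whom, and how often, the Python sketch below summarizes a small usage log. The log entries, user names, report names and the `summarize_usage` helper are all hypothetical; in practice this data would come from whatever audit or usage logs your reporting and database tools expose.

```python
from collections import Counter

# Hypothetical usage log: one (user, report) entry per report run.
usage_log = [
    ("ana", "weekly_sales"),
    ("ben", "weekly_sales"),
    ("ana", "weekly_sales"),
    ("cara", "inventory_aging"),
    ("ben", "churn_summary"),
    ("ana", "inventory_aging"),
]

def summarize_usage(log):
    """Return {report: {"runs": run_count, "users": sorted_user_names}}."""
    runs = Counter(report for _, report in log)
    users = {}
    for user, report in log:
        users.setdefault(report, set()).add(user)
    return {
        report: {"runs": runs[report], "users": sorted(users[report])}
        for report in runs
    }

# Print reports from most-run to least-run.
for report, stats in sorted(summarize_usage(usage_log).items(),
                            key=lambda item: -item[1]["runs"]):
    print(report, stats)
```

Even a summary this simple shows which reports are heavily used and which serve only a single person, which is the kind of insight worth sharing more broadly.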
Understanding how security roles are set up both within and outside your applications can provide key insights into your data asset usage. Whether it’s your master data management system, operational applications, or database, there are opportunities to identify data risks and data enablement gaps by analyzing and updating security role setup. This metadata tracks who has access to the data, what data they have access to and what kind of access they have. Reviewing this metadata regularly can lead to immediate business impact.
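One way to picture this kind of review is the Python sketch below, which checks a list of access grants for two things: sensitive datasets with more than one writer (a possible risk) and datasets that nobody has been granted access to (a possible enablement gap). The grants, dataset names, the "more than one writer" rule and the `review_access` helper are assumptions made for illustration, not a recommended policy.

```python
# Hypothetical access grants: (user, dataset, access_level).
grants = [
    ("ana", "customer_master", "write"),
    ("ben", "customer_master", "read"),
    ("cara", "payroll", "write"),
    ("dev", "payroll", "write"),
    ("ben", "sales_mart", "read"),
]
sensitive = {"payroll"}
all_datasets = {"customer_master", "payroll", "sales_mart", "supplier_master"}

def review_access(grants, sensitive, all_datasets):
    """Return (risks, gaps) based on who has what kind of access to which data.

    risks: sensitive datasets that more than one user can write to.
    gaps:  datasets with no grants at all.
    """
    writers = {}
    granted = set()
    for user, dataset, level in grants:
        granted.add(dataset)
        if level == "write" and dataset in sensitive:
            writers.setdefault(dataset, []).append(user)
    risks = {d: sorted(u) for d, u in writers.items() if len(u) > 1}
    gaps = sorted(all_datasets - granted)
    return risks, gaps

risks, gaps = review_access(grants, sensitive, all_datasets)
print("Possible risks:", risks)
print("Possible enablement gaps:", gaps)
```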
These sources are a good starting point as you begin to examine your use of active metadata and its role in your organization’s overarching data governance efforts. If your organization is struggling to understand how to use its metadata or is looking to expand its data governance efforts, contact one of our professionals today.