Businesses use metadata to classify, manage and organize the massive amounts of diverse data collected across their enterprise. This information is critical to both understanding and effectively deploying data resources to support enterprise departments. In addition, metadata provides essential information required to enable advanced analytics.
However, data is continuously in motion across an organization’s data supply chain. As data is constantly ingested, created and transformed, its metadata changes too. If metadata is not kept current, the value of data declines: understanding diminishes and users become less likely to identify the right data for the right task. As a result, insights are missed and opportunities squandered.
Metadata management gives business and technical users vital information on the assets in data repositories, from where data is located to when and how it was created. But the details of data’s origin, lineage, transformations, level of quality and relationships to other data or reports can change at any moment. By turning descriptive metadata into active metadata, businesses can apply machine learning algorithms to automate asset curation, which reduces the need for manual tasks and helps keep metadata descriptions accurate and reliable. Furthermore, active metadata helps to proactively manage the risk associated with changes in data quality and data use.
Active metadata takes action on descriptive metadata, which is captured by descriptive analytics and stored alongside other critical technical, physical and logical metadata. Machine learning drives these actions, automating the implementation and maintenance of data management as well as the curation of data assets. It takes over time-consuming manual tasks such as data cataloging, data tagging, identifying relationships between data sets and linking related business terms.
Active metadata also automates the process of managing data quality risk. Data quality rules can be triggered by the descriptors captured for each data set. Because active metadata can relate similar data sets, the same quality rules can be applied to all of them instead of building new rules for each data set. This can substantially reduce the resources businesses spend on managing data.
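To make the idea of descriptor-triggered rules concrete, here is a minimal sketch in Python. It assumes each column carries a single descriptor tag and that each rule is written once and keyed by that tag. The names QualityRule, RULES and failed_checks, the two example rules and the sample rows are all hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class QualityRule:
    name: str
    descriptor: str                   # metadata tag that triggers the rule
    check: Callable[[object], bool]   # returns True when a value passes


# Hypothetical rules, each defined once and keyed by a descriptor.
RULES = [
    QualityRule("email_has_at_sign", "email",
                lambda v: isinstance(v, str) and "@" in v),
    QualityRule("amount_non_negative", "currency",
                lambda v: isinstance(v, (int, float)) and v >= 0),
]


def failed_checks(rows, column_descriptors):
    """Return (row_index, column, rule_name) for each value failing a triggered rule."""
    failures = []
    for column, descriptor in column_descriptors.items():
        for rule in RULES:
            if rule.descriptor != descriptor:
                continue  # the rule is not triggered for this column
            for i, row in enumerate(rows):
                if not rule.check(row.get(column)):
                    failures.append((i, column, rule.name))
    return failures


# Two unrelated data sets share the "email" descriptor, so they share the rule.
crm_rows = [{"contact": "ann@example.com"}, {"contact": "not-an-email"}]
billing_rows = [{"invoice_email": "bob@example.com", "total": -5.0}]

crm_failures = failed_checks(crm_rows, {"contact": "email"})
billing_failures = failed_checks(
    billing_rows, {"invoice_email": "email", "total": "currency"}
)
```

In this sketch, onboarding a new data set only requires assigning descriptors to its columns. No new rule has to be written unless a new kind of descriptor appears.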
In addition, active metadata provides more context about data sets. It indicates not only whether data is available (easily discoverable) but also whether it is suitable for a given purpose. For example, when a data scientist is searching for data, active metadata guides them to the data they need for a specific project or purpose.
Businesses can also use active metadata to build recommendation engines that enable data asset discovery. For example, when a user searches for data, the engine can surface similar data sets that are likely to be of interest and that previously would have gone undiscovered. The same active metadata descriptors can also tell businesses what data is being used, what is obsolete and what is redundant, so that unused, duplicate or outdated data can be eliminated.
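One simple way to picture such a recommendation engine is to compare the descriptor tags attached to each data set. The Python sketch below uses Jaccard similarity over tag sets. This is an illustrative stand-in chosen for brevity, not a claim about how any specific platform ranks results. The catalog contents, the recommend function and the min_score threshold are assumptions.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two tag sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical catalog: data set name -> descriptor tags.
CATALOG = {
    "sales_2023": {"sales", "revenue", "customer", "region"},
    "sales_2022": {"sales", "revenue", "customer"},
    "hr_headcount": {"employee", "department"},
    "crm_accounts": {"customer", "region", "account"},
}


def recommend(name, catalog, min_score=0.2):
    """Return other data sets ranked by tag similarity to the named one."""
    target = catalog[name]
    scored = [
        (other, jaccard(target, tags))
        for other, tags in catalog.items()
        if other != name
    ]
    # Keep only sufficiently similar data sets, most similar first.
    ranked = sorted(
        (pair for pair in scored if pair[1] >= min_score),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [other for other, _ in ranked]


suggestions = recommend("sales_2023", CATALOG)
```

With this sample catalog, a search that lands on sales_2023 also surfaces sales_2022 and crm_accounts, while hr_headcount shares no tags and is left out.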
Although businesses can automatically capture data, discover relationships between data sets and link the data to other technical and business resources, they still need the right tools, technologies, and processes to create active metadata.
Active metadata applications can jump-start use cases from data governance to data quality to analytics, and can take data management to the next level as part of an integrated data intelligence platform that features all three. A solution with pre-built and customizable connectors that quickly and efficiently harvest descriptive metadata from multiple sources can deliver automated active metadata and a complete view of the organization’s data landscape. This view includes the available data, its location, the data owner or steward and lineage. With a holistic, real-time view of the evolving data landscape, users have a unified, quickly accessible and curated glossary of definitions, synonyms and business attributes for data. Users can also easily define, track and manage all aspects of their data assets to overcome challenges and make important business decisions.
An integrated intelligence platform should also include data quality capabilities that run data integrity checks for completeness, conformance and validity. Active metadata helps verify that data is transformed correctly as it flows through multiple systems, and that the data remains accurate. Analytics capabilities should then leverage machine learning algorithms for continuous self-learning to improve data quality.
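The three integrity checks named above can be sketched as simple ratios. The definitions used here are working assumptions for illustration: completeness is the share of rows with a non-empty value, conformance is the share of non-empty values matching an expected format, and validity is the share of non-empty values drawn from an allowed set. The column names, the date pattern and the sample rows are hypothetical.

```python
import re

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")   # assumed format: YYYY-MM-DD
ALLOWED_STATUS = {"open", "closed"}               # assumed set of valid values


def _non_empty(rows, column):
    """Values of the column that are present and not blank."""
    return [r[column] for r in rows if r.get(column) not in (None, "")]


def completeness(rows, column):
    """Share of rows where the column has a non-empty value."""
    return len(_non_empty(rows, column)) / len(rows) if rows else 0.0


def conformance(rows, column, pattern):
    """Share of non-empty values that match the expected format."""
    values = _non_empty(rows, column)
    if not values:
        return 0.0
    return sum(1 for v in values if pattern.fullmatch(str(v))) / len(values)


def validity(rows, column, allowed):
    """Share of non-empty values that belong to the allowed set."""
    values = _non_empty(rows, column)
    if not values:
        return 0.0
    return sum(1 for v in values if v in allowed) / len(values)


orders = [
    {"order_date": "2024-01-15", "status": "open"},
    {"order_date": "15/01/2024", "status": "closed"},
    {"order_date": "", "status": "pending"},
    {"order_date": "2024-02-01", "status": "open"},
]

report = {
    "order_date_completeness": completeness(orders, "order_date"),
    "order_date_conformance": conformance(orders, "order_date", DATE_PATTERN),
    "status_validity": validity(orders, "status", ALLOWED_STATUS),
}
```

Scores like these could be stored as metadata alongside each data set, so that a drop after a transformation step is visible to downstream users.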
In addition, the platform should have automatic data discovery capabilities, enabling the constant capture and monitoring of changes to metadata. Changes in metadata can then be automatically discovered and applied across the data supply chain to deliver meaningful insights on data.
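As a rough illustration of capturing and monitoring metadata changes, the Python sketch below compares two snapshots of a table’s column metadata and reports what was added, removed or changed. The snapshot format (column name mapped to a type string) and the diff_metadata function are assumptions made for this example. A real platform would track far richer metadata.

```python
def diff_metadata(previous: dict, current: dict) -> dict:
    """Compare two metadata snapshots (column name -> type) and list the changes."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    changed = sorted(
        key for key in previous.keys() & current.keys()
        if previous[key] != current[key]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots of the same table taken on consecutive days.
yesterday = {"id": "int", "email": "string", "signup": "string"}
today = {"id": "int", "email": "string", "signup": "date", "country": "string"}

changes = diff_metadata(yesterday, today)
```

A scheduled job could run a comparison like this against each source and pass any non-empty result to the systems downstream that depend on the affected columns.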
If appropriately managed, active metadata is a major business advantage. From new product development to targeted marketing to risk management, it can shorten lengthy manual processes so businesses stay a step ahead of the competition.