Have you ever met someone, perhaps at work, but didn’t catch their name or weren’t formally introduced during your first encounter? Since then, you exchanged pleasantries on numerous occasions, perhaps engaged in entire conversations—but still didn’t know their name?
By that point, enough time had passed that the chance to ask their name had passed too, at least without it being awkward at best and insulting at worst.
That’s actually how many people feel about data governance. As 2018 begins, it seems that everyone is talking about it, and its importance, but there are many among us who feel like we missed some vital introduction. We think we should know exactly what data governance is, but we missed our opportunity to ask, and now it seems almost embarrassing to inquire at this late date.
But the truth is, formalized data governance is really in its infancy, and you’ll find that the people who don’t quite understand it comprise the majority. Even among those who say they understand exactly what data governance is, definitions will vary widely. Industries, lines of business, job function—all of these inform and influence how we define data governance, and what it means to us. But the important thing is not to establish a universal, formal definition, but rather to understand the fundamentals of what data governance entails, what it is intended to accomplish, and how it can serve increasingly important functions in an era of big data. So let’s dig into this topic and try to understand what everyone seems to be talking about.
We’re certainly not going to dig into a comprehensive history of data. But a basic understanding of data governance today requires some historical context on how data has evolved over time. In the early days, data was largely a transactional concern. The use or production of data was process-centric, applied or generated from business processing activities and limited to a select few.
But over time, the realization gradually dawned that data had real potential beyond the realm of IT and data processing. Organizations began to consider ways that data could be elevated from byproduct to business asset through data analysis for decision-making, an evolutionary step that marked the birth of what is commonly called Business Intelligence (BI). Since that time, the use cases for BI have grown exponentially, and technological advancements have enabled increasingly sophisticated mining of data for business insights. For a number of years, though, it seemed that only the largest companies with the deepest pockets were in a position to reap the full benefits of BI and data analytics, but those days are over. Big data no longer pertains only to big business, as diverse organizations of varying size can collect data at a dizzying pace—yet the value of data lies not in volume, but in an organization’s ability to quickly leverage that data for business advantage. And in an increasingly complex regulatory landscape, the compliance risks can be steep if data and processes aren’t properly managed.
Data today represents a critical asset, and the need to extract value from it has moved from a business advantage to a competitive imperative. Its broad array of use cases now requires business professionals to find and manipulate data quickly so they can perform analytics and solve business problems. But to realize data’s full potential, it must be managed like any other asset before it turns into a liability. You need to know where it came from, how old it is, what its quality is, where to find it, and how to use it appropriately. Take, for example, a third-party or licensed data set. How do you know if you are authorized to use it in your data analysis? How do you know you can trust it? You don’t, unless governance policies and data owners clearly state the scope of its use and provide metrics for judging its quality.
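To make those questions concrete, here is a minimal sketch of what a single governed asset record might look like. Everything in it is hypothetical and for illustration only: the `DataAsset` fields, the `can_use` helper, the 0.8 quality threshold, and the sample vendor data set are assumptions, not part of any particular governance product.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataAsset:
    """Hypothetical catalog record answering the basic governance questions."""
    name: str
    source: str             # where it came from
    location: str           # where to find it
    last_refreshed: date    # how old it is
    quality_score: float    # 0.0 to 1.0, assumed to be published by the data owner
    permitted_uses: set[str] = field(default_factory=set)  # scope stated by policy

    def age_in_days(self, today: date) -> int:
        return (today - self.last_refreshed).days


def can_use(asset: DataAsset, purpose: str, min_quality: float = 0.8) -> bool:
    """True only if policy allows the purpose and quality meets the (assumed) bar."""
    return purpose in asset.permitted_uses and asset.quality_score >= min_quality


# Illustrative third-party data set; all values are made up.
licensed = DataAsset(
    name="licensed_demographics",
    source="third-party vendor",
    location="warehouse.vendor.demographics",
    last_refreshed=date(2018, 1, 2),
    quality_score=0.92,
    permitted_uses={"internal_analytics"},
)

print(can_use(licensed, "internal_analytics"))
print(can_use(licensed, "resale"))
```

The point of the sketch is that authorization and trust become lookups against recorded policy and quality metrics rather than guesses made by each analyst.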
The answers to all of these questions comprise the foundation of data governance in business. It requires a repository of these answers, as well as the people and systems that govern data across an enterprise. Simply defined, it is the formal orchestration of people, processes, and technology that enables an organization to leverage data as an enterprise asset. Sounds easy, but organizing data governance on a spreadsheet or in SharePoint will only get you so far.
Depending on the organizational role, one’s viewpoint of data governance can be quite narrow. For instance, a compliance professional will understandably view data governance through the lens of potential regulatory violations. In the banking industry, BCBS 239 informs many data management strategies, but banking data also offers a wealth of analytics opportunities for improved customer experience and competitive differentiation beyond the compliance arena. For this reason, data must be properly catalogued, scored, and defined so that users across an enterprise can view available assets, understand what they are and how to use them, and have a reliable barometer for gauging the caliber of that data when making business decisions.
A data governance program needs to begin with the basics, such as data lineage (defined as the lifecycle of data from origin over time and through systems and processes); a data dictionary (a description of each data object within a database, its type, and its relationship to other data); and a business glossary (the definitions of business terms and how they may vary across business functions). Take data lineage, for example. It is of utmost importance to IT professionals, but it’s information overload for business professionals, who need it translated into a business lineage. That translation is a key capability of data governance. Beyond these key components, data governance must also define policies, ownership, and data quality across an enterprise. But arguably most important to data governance, or at least the key to enabling accurate, meaningful predictive analytics that turn raw data assets into real business value, is another oft-misunderstood buzzword: metadata management.
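As a rough illustration of two of those building blocks, a data dictionary and a business glossary can be pictured as simple structures. This is only a sketch under stated assumptions: the `DictionaryEntry` class, the sample tables, the glossary terms, and the `terms_with_conflicting_definitions` helper are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryEntry:
    """Data dictionary row: one data object, its type, and its relationship."""
    table: str
    column: str
    data_type: str
    related_to: Optional[str] = None  # e.g. the column this one references


# Illustrative data dictionary for two made-up tables.
data_dictionary = [
    DictionaryEntry("orders", "customer_id", "INTEGER", related_to="customers.id"),
    DictionaryEntry("customers", "id", "INTEGER"),
]

# Business glossary: term -> definition per business function (made-up values).
business_glossary = {
    "customer": {
        "sales": "An account with a signed contract",
        "support": "Anyone who has opened a support ticket",
    },
    "order": {
        "sales": "A confirmed purchase",
        "finance": "A confirmed purchase",
    },
}


def terms_with_conflicting_definitions(glossary):
    """Return terms that are defined differently by different functions."""
    return sorted(
        term
        for term, by_function in glossary.items()
        if len(set(by_function.values())) > 1
    )


print(terms_with_conflicting_definitions(business_glossary))
```

Even at this toy scale, the glossary makes it visible that "customer" means different things to sales and support, which is exactly the kind of variation a governance program has to surface.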
If data governance maps the ecosystem of how data flows and functions across an organization’s data supply chain, metadata management (metadata is often referred to as “data about data”) provides the underpinnings for understanding that data at a granular level and therefore using it effectively across an enterprise. We depend on metadata every day and don’t even know it. Jump in your car, turn on the radio or plug in your smartphone, and metadata automatically shows you the name, artist, and duration of the song you’re playing.
To transform data governance you need the ability to connect the metadata to the data governance business glossary (aka data catalog), in order to create a rich understanding of the data beyond its data definition and data quality to include pertinent metadata information. For example, without this fusion of metadata with the data catalog, you won’t be able to connect the dots to do things like translate technical data lineage into easy-to-understand business lineage.
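One way to picture that translation is as a lookup from technical object names to catalog terms. The sketch below is an assumption-laden illustration, not how any specific tool does it: the system names, the `catalog_links` mapping, and the `to_business_lineage` function are all hypothetical.

```python
# Technical lineage: the ordered path data takes through systems (made-up names).
technical_lineage = [
    "crm.tbl_cust_raw",
    "staging.cust_dedup_tmp",
    "warehouse.dim_customer",
    "reports.churn_dashboard",
]

# Assumed link between technical metadata and business glossary / catalog terms.
catalog_links = {
    "crm.tbl_cust_raw": "Customer records (CRM)",
    "warehouse.dim_customer": "Customer master",
    "reports.churn_dashboard": "Churn report",
}


def to_business_lineage(steps, links):
    """Keep only hops that have a business term, collapsing adjacent repeats."""
    business = []
    for step in steps:
        label = links.get(step)
        if label is None:
            continue  # purely technical hop, hidden from business users
        if not business or business[-1] != label:
            business.append(label)
    return business


print(" -> ".join(to_business_lineage(technical_lineage, catalog_links)))
# prints: Customer records (CRM) -> Customer master -> Churn report
```

The intermediate staging table never appears in the business view because nothing in the catalog links it to a business term, which is the "connecting the dots" that depends on metadata being tied to the glossary.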
Data, as mentioned previously, can be a tremendous asset, but if business users don’t understand what it is, where it is or how to use it through clearly defined policies and processes, it may as well not exist. But managing metadata in real time as part of a comprehensive data governance framework enables users to easily understand and utilize that data to run analytics and uncover actionable insights. Misunderstanding, on the other hand, breeds mistrust and misuse, leading to questionable results or underutilization.
Clearly there are many moving parts to constructing a successful data governance framework, but building a solution step-by-step maximizes the value of data assets and creates a successful synergy of people, process, and technology. The best data governance program not only maximizes the value of analytic insights, but also ensures the ongoing quality of your data through machine learning, enhanced efficiency and asset utilization through understanding and transparency, and increased collaboration across your enterprise through clearly defined responsibilities and workflows. Data governance is dependent on a supporting framework of systems and processes, to be sure, but it is equally reliant on data owners, stewards, and the business users who turn that data into value.
So start by asking yourself some simple questions, like “Would a broad set of users all give the same definition of the data on this report?” Or, “Who is the data owner, and what is the quality of the data?” More often than not, the answers differ depending on who you ask, which is a symptom that data governance isn’t functioning properly.