What Happened to Big Data?
Is “big data” even a thing anymore? For many years it was a thing, but I’m at a point where I believe big data is just the new normal. You can’t use your smartphone, drive your car, monitor your fitness or swipe your credit card anywhere, at any time, without creating volumes of information. Just a few short years ago, that much data would have taken up the entire hard drive of your desktop PC. If you have any illusions that big brother isn’t here, let them go right now! That being said, it’s not all bad.
Believe it or not, advanced analysis of big data stores is making your life easier. Targeted marketing, analytics around usage patterns and customized plans designed to match your unique needs expedite delivery of products and services. In addition, risk models can monitor activity to proactively notify you of fraud and other unwelcome surprises.
Big Data Unicorns
Enter the unicorns. With the relatively recent explosion into petabytes, exabytes and zettabytes of information, expertise in the big data world is very rare. Environments built on Hadoop or other high-volume platforms require highly skilled technical professionals who have extensive experience or training, not just to run analysis, but to acquire and prepare data and to operationalize insights so that business users can immediately consume and act on them. These big data professionals are modern-day unicorns. Elusive and hard to find, these IT specialists are sometimes viewed as having ‘mystical’ powers as they make sense of models that uncover and predict patterns and trends to optimize business processes and customer satisfaction. With so few available, and demand rising every day, IT executives are searching for more self-service or automated big data solutions for analysts so their big data unicorns can focus on more strategic and complex projects. Let’s look at some of the key processes within a big data environment where self-service and/or more automation would give our unicorns a break:
- Data Acquisition: Your big data environment, whether a data lake or another platform, should be able to ingest a constant flow of data from many sources and in various formats. Acquisition challenges could easily derail the productivity of our unicorns if a few specific capabilities aren’t set up. Internal operational systems, as well as third-party feeds, should load into your landing zone automatically or on a schedule so that the freshest data is always available. Whether loads are batch or real-time, your environment should be able to accommodate structured and unstructured data. A best practice is to implement validation checks that notify end-users when new feeds arrive, so they know when fresh data sets have landed completely. Another best practice that facilitates self-service adoption is to integrate a business- and tech-friendly governance layer over all data sets to house definitions, policies, ownership and other metadata. This expedites acquisition by allowing all users, even our unicorns, to understand and appropriately select data sets and to know exactly whom to contact with questions or concerns.
- Data Preparation: Preparing data for analysis is another challenge for our unicorns. Handling requests for ever-changing data elements, profiling the quality of the data sets and transforming information into usable formats can be very time-consuming. And just when end-users think they have it down, the data refreshes, making the analysis out of date! The most advanced big data environments now offer drag-and-drop capabilities that make data prep very intuitive. Analysts can literally drag data sets across a canvas to prepare the data they want to use. The ability to profile data sets for data quality is also imperative. Self-service capabilities to check completeness, value conformity, consistency and many other attributes are now available. Here again, a collaborative governance layer is required to surface data quality scores, policies and key metadata, which streamlines adoption by end-users who need to understand the data sets they are using. In addition, machine learning applied to data quality validation automates the process even further, and with far deeper insights than would be feasible through manual analysis.
- Operationalization: Delivering analytical insights to team members who can put them into action is another call for unicorn-like resources. Advanced visualization solutions that can accommodate big data sources exist, but once again, unique skills are required to deliver them, which means delays in getting analysis to operational teams. A big data platform that is directly integrated with visual, interactive reporting and dashboards gives operational teams real-time access to results. Analytics can always be pushed up to enterprise visualization solutions as needed, but the pressure to put insights into action quickly no longer falls on those few skilled resources.
- Analysis: One of the final considerations is the big data analytical capabilities themselves. Data scientists are arguably the scarcest of all unicorns. Putting machine learning into the hands of analysts allows these technical specialists to focus on the highest-level strategic projects within an organization. Analyses such as segmenting outliers from terabytes of information for immediate case management, classifying massive data sets as pertaining to a specific compliance policy, or analyzing historic information to forecast future behaviors are optimizing business processes and customer satisfaction across proactive organizations worldwide. The more these and other big data analytics can be leveraged by the average user, the greater the return on the investment that makes your company unique: your data.
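To make a few of the checks in the list above more concrete, here is a minimal Python sketch of a feed-freshness check, two data quality scores (completeness and value conformity), and a simple outlier flag. Everything in it is an assumption for illustration: the function names, the sample records, the six-hour freshness window and the five-digit ZIP pattern are hypothetical, and the outlier rule (a modified z-score based on the median) is just one common choice, not the method of any particular platform.

```python
import re
from datetime import datetime, timedelta, timezone
from statistics import median


def feed_is_fresh(last_arrival, now, max_age):
    """Acquisition: True if the feed last arrived within max_age of now."""
    return now - last_arrival <= max_age


def completeness(rows, field):
    """Preparation: fraction of rows where the field is present and non-empty."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) not in (None, ""))
    return filled / len(rows)


def conformity(rows, field, pattern):
    """Preparation: fraction of non-empty values that fully match the regex."""
    values = [row[field] for row in rows if row.get(field) not in (None, "")]
    if not values:
        return 0.0
    matches = sum(1 for value in values if re.fullmatch(pattern, str(value)))
    return matches / len(values)


def flag_outliers(values, threshold=3.5):
    """Analysis: values whose modified z-score (median/MAD) exceeds threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread around the median, so nothing can be flagged
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


if __name__ == "__main__":
    # Hypothetical sample records, invented purely for this illustration.
    rows = [
        {"id": "A1", "zip": "12345"},
        {"id": "A2", "zip": ""},
        {"id": "A3", "zip": "1234"},
        {"id": "A4", "zip": "54321"},
    ]
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    last_arrival = now - timedelta(hours=2)

    print("fresh:", feed_is_fresh(last_arrival, now, timedelta(hours=6)))
    print("zip completeness:", completeness(rows, "zip"))
    print("zip conformity:", conformity(rows, "zip", r"\d{5}"))
    print("outliers:", flag_outliers([10, 12, 11, 13, 12, 500]))
```

In a real environment, scores like these would typically be computed by the platform itself and published through the governance layer, so analysts see them without writing any code.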
Clearly there ARE unicorns out there. Maybe you’ve even seen one. Big data environments that deploy some of the self-service capabilities described above are liberating the unicorns by spreading the unicorn magic around. Could you be the next unicorn in your company?
To learn more about how self-service data prep can help you identify data unicorns and find the analytics you need, check out the data sheet below.