With the promise of big data (solving unsolvable problems, informing better decision making, creating new products and services, discovering patterns and acting on them, etc.) on the horizon, what has really changed? Does this mean that everything we know and do with not-so-big data should be tossed?
Not so fast. While big data as a mainstream concept is somewhat new, traditional information management is not. It can still be a valuable organizing structure to bring order to chaos. After all, organizations big enough for big data ought - ought - to have an information governance program already in place. And big data should be part of that program. According to "Big Data Governance" by Sunil Soares, a book I recommend:
"Big data governance is part of a broader information governance program that formulates policy relating to the optimization, privacy, and monetization of big data by aligning the objectives of multiple functions. ...
... The traditional disciplines (of an information governance program) are organization, metadata, privacy, data quality, business process integration, master data integration, and information lifecycle management."
These disciplines are exactly what data big and small need. The rules of the road do still apply.
I am not making light of the significant issues facing effective big data usage, or the need for different tools to handle it. To abuse a lyric from the wise philosopher and rapper Eminem, these issues come from both the top down and bottom up, like "when a tornado hits a volcano."
Representing the top-down tornado: massive amounts of regulation, the convergence of traditional patient, payer, provider and life science roles, and the rapid introduction of new, frequently disruptive technologies. While we IT professionals have always been in the business of change, even our definition of normal has been blown away.
Representing the bottom-up volcano, as discussed in previous posts ("HIT legacy systems: Outrunning the Zombie Apocalypse" and "Fracking healthcare big data: Drilling for value or just hot gas?"), there are issues with the current state of existing data:
- Legacy systems house significant data, usually in mysterious ways
- Vendors allow non-standard data entry within their own systems, and do not play nicely with others
- Clinical data is corrupted as it is translated across standards, regional and national registers and Health Information Exchanges
And there is the very nature of what big data is: "high volume, velocity and/or variety of information assets" (Gartner Group) that traditional methods and tools cannot handle well. A bubbling cauldron of data, to be sure.
So yes, I get it. But let's take a big step back. We've seen this sort of "the old rules don't apply" froth before. The term "big data" is currently enjoying quite a ride on the hype cycle (can the title "Chief Big Data Officer" be far behind? Please, no!). Tornadoes, volcanoes and hype-mongers aside, it is still, after all, just data, even if your data center (or cloud) needs its own zip code...
This article was originally published in Computerworld on May 22, 2013.