Thoughts on the Strata Rx Healthcare Conference

By Bill Schmarzo, Chief Technology Officer, "Dean of Big Data" | October 23, 2012

Last week several colleagues from EMC and I attended the Strata Healthcare Conference here in San Francisco, and I thought it was a huge success. I also expect that for some folks, it was a very troublesome conference. Why would I say that? Because the number of massive data sets available today, and the number that are coming online in very short order, is almost debilitating. It’s one of the reasons why many of the entrepreneurs and venture capitalists who had success in the internet and financial services industries have now trained their sights on the healthcare industry – a $1 trillion industry (in just the US) where severe processing, analytic, and business inefficiencies exist (see Figure 1).

Figure 1:  US Healthcare Spend 1997-2017

Healthcare Data Challenges and Opportunities

There are many massive, big data sources – both structured and unstructured – available that can yield new information and insights about patients, procedures, medical treatments, medicine testing, clinical studies, drug research, and the payer-provider relationship (see Figure 2).

Figure 2:  Tsunami of Current and New Healthcare Data Sources

And new massive, big data sources are on their way, such as:

  • Genomics or gene sequencing, which contains over 2.3B snips of data per human strand of DNA.  Not only has the price of DNA and genomics testing dropped to the level of the common man, but organizations are working to “liberate” your DNA so that it is easier to share with other genomics organizations and to pool your genomics with the genomics of others in order to identify causes of diabetes, cancer, blindness, and even baldness.
  • Mobile apps and social media are creating vibrant communities around specific health issues.  There is much innovation in the mobile healthcare space to simplify and encourage the capture and sharing of your personal health data by startups such as WebMD and
  • “Intelligent” personal home health monitoring devices (blood glucose, blood pressure, medication monitoring, smart toilets) that will unleash a tsunami of data and insights about your current health conditions and flag patterns or trends of personal health concerns.

Healthcare Big Data Enabling Technologies

As I’ve discussed in the past, the advent of new, more detailed data sources typically requires new technologies designed to acquire, store, manipulate, and analyze these new data sources.  Big data technologies, many of which have been perfected in other industries like digital media (ad serving, real time bidding, attribution analysis) and financial services (algorithmic trading, fraud detection), are now available to the healthcare industry to merge, integrate, synchronize, and tease out the insights buried in these new data sources.

Figure 3:  Big Data Enabling Technologies

Big Data Healthcare Use Cases

A number of healthcare use cases, enabled by these new sources of data and innovative big data technologies, are starting to emerge.  Here are a few examples:

Figure 4:  Healthcare Big Data Use Cases

  • Detecting fraud in real-time by using Hadoop to match historical claims and payment data, with in-memory computing to analyze current transactions to flag or score potential fraudulent activities.
  • Reducing hospital readmissions by using MPP databases and data virtualization to access and integrate past admissions and outcomes with current patient data, and in-database analytics to create re-admission scores at the time of patient check-in that can suggest personalized hospitalization plans for at-risk patients.
  • Improving patient care using Hadoop and data virtualization to synchronize all of a patient’s history of treatments, procedures, lifestyle changes, therapy – and even DNA data in the near future – with advanced analytics to attribute the effectiveness of different medications, treatments, and lifestyle changes upon a patient’s health score.
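To make the fraud detection use case a little more concrete, here is a minimal sketch of the two-step pattern it describes: a batch step that summarizes historical claims, and a scoring step that compares each incoming transaction against that history. Everything in it is an assumption for illustration only. The provider IDs, claim amounts, function names, and the z-score threshold are all hypothetical, and a simple z-score stands in for whatever model a real system would use. In practice the batch step would run on something like Hadoop and the scoring step in memory; here both are plain Python.

```python
from statistics import mean, stdev

# Hypothetical historical claims: (provider_id, claim_amount).
# In a real system this history would live in a batch store, not a list.
HISTORICAL_CLAIMS = [
    ("prov_a", 120.0), ("prov_a", 135.0), ("prov_a", 110.0), ("prov_a", 125.0),
    ("prov_b", 900.0), ("prov_b", 950.0), ("prov_b", 1000.0), ("prov_b", 925.0),
]


def build_baselines(claims):
    """Batch step: compute (mean, standard deviation) of claim amounts per provider."""
    by_provider = {}
    for provider_id, amount in claims:
        by_provider.setdefault(provider_id, []).append(amount)
    # stdev needs at least two data points, so skip providers with less history.
    return {
        provider_id: (mean(amounts), stdev(amounts))
        for provider_id, amounts in by_provider.items()
        if len(amounts) >= 2
    }


def fraud_score(baselines, provider_id, amount):
    """Scoring step: how many standard deviations the claim sits from the provider's norm.

    Returns None when there is no baseline for the provider.
    """
    if provider_id not in baselines:
        return None
    mu, sigma = baselines[provider_id]
    if sigma == 0:
        return 0.0 if amount == mu else float("inf")
    return abs(amount - mu) / sigma


def flag(baselines, provider_id, amount, threshold=3.0):
    """Flag a claim for review if it has no baseline or its score exceeds the threshold."""
    score = fraud_score(baselines, provider_id, amount)
    return score is None or score > threshold


if __name__ == "__main__":
    baselines = build_baselines(HISTORICAL_CLAIMS)
    for provider_id, amount in [("prov_a", 125.0), ("prov_a", 500.0), ("prov_c", 10.0)]:
        print(provider_id, amount, "flagged" if flag(baselines, provider_id, amount) else "ok")
```

The point of the split is that the expensive work (summarizing years of claims) happens ahead of time, so scoring a current transaction is only a lookup and a little arithmetic.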

“Data Is Good.  More Data Is Better!”

Few industries have the variety of data sources, many of them publicly available, to provide unique, actionable insights into the quality and cost of healthcare.  The potential is almost endless, as healthcare organizations look to take the next step in pooling data across patients, treatments, procedures, studies, and more to preempt disease outbreaks or identify the potential causes of life-threatening health conditions like diabetes (see Figure 5).

Figure 5:  Pooling Data to Yield New Healthcare Insights

To quote my colleague Hulya Emir-Farinas, a data scientist within our Greenplum division, “Data is good.  More data is better.”  More detailed and diverse data sets can yield new insights and perspectives on some of our healthcare problems, and enable new solutions to fulfill that goal of providing better patient care at lower costs.   More data can enable healthcare organizations to uncover the real causes of healthcare problems so that actionable, cost-effective solutions and care can be directed at those problems.

By the way, if you’d like to see my Strata Rx presentation, you can check it out here.

About Bill Schmarzo

Chief Technology Officer, "Dean of Big Data"

The moniker “Dean of Big Data” may have been applied in a light-hearted spirit, but Bill’s expertise around data analytics is no joke. After being deeply immersed in the world of big data for over 20 years, he shows no signs of coming up for air.

Bill speaks frequently on the use of big data, with an engaging style that has gained him many accolades. He’s presented most recently at STRATA, The Data Science Summit and TDWI, and has written several white papers and articles about the application of big data and advanced analytics to drive an organization’s key business initiatives. Prior to joining Consulting as part of EMC Global Services, Bill co-authored with Ralph Kimball a series of articles on analytic applications, and was on the faculty of TDWI teaching a course on designing analytic applications.

Bill created the EMC Big Data Vision Workshop methodology that links an organization’s strategic business initiatives with supporting data and analytic requirements, and thus helps organizations wrap their heads around this complex subject.

Bill sets the strategy and defines offerings and capabilities for the Enterprise Information Management and Analytics within Dell EMC Consulting Services. Prior to this, he was the Vice President of Advertiser Analytics at Yahoo at the dawn of the online Big Data revolution.

Bill is the author of "Big Data: Understanding How Data Powers Big Business" published by Wiley.
