This is the third in a three-part series by Dr. Heather Buschman on Big Data in the pharmaceutical and biotechnology industries. Check out Part 1 and Part 2.


Big Data, the vast amount of information generated by recent advances in medical technologies such as high-content screening, next-gen DNA sequencing, and wireless health monitoring, could revolutionize medical research and healthcare. Yet a number of bottlenecks, such as the inherent complexity of living organisms, lack of computing power, and data inaccessibility, are holding back progress. What can you do? Here are three places to start:

1. Make it good

There’s no sense in wasting storage space and analysis time on bad data. As they say, garbage in, garbage out. Get back to basics with well-planned experiments, proper controls, and lots of meta-information (e.g., when and where the sample was collected, and how it was collected and processed). Focus on verification and reproducibility. It doesn’t even have to be fancy: according to a recent survey of pharmaceutical executives, 63 percent say the best validation technique is simply verifying patient information over the phone [1].
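For readers who manage their own datasets, the meta-information advice can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SampleRecord` class, its field names, and the `missing_metadata` helper are hypothetical, not part of any standard or of the sources cited here. The idea is simply to record when, where, and how each sample was collected and processed, and to flag records with gaps before they reach analysis.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from typing import List, Optional


@dataclass
class SampleRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    sample_id: str
    collected_at: Optional[datetime] = None     # when the sample was collected
    collection_site: Optional[str] = None       # where it was collected
    collection_method: Optional[str] = None     # how it was collected
    processing_protocol: Optional[str] = None   # how it was processed


def missing_metadata(record: SampleRecord) -> List[str]:
    """Return the names of metadata fields that are absent or blank."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(f.name)
    return missing
```

A record that lists only an ID, a collection time, and a site would be reported as missing `collection_method` and `processing_protocol`, which is a cue to fill in those details while they can still be recovered.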

2. Get help from Big Data experts

Drowning in data? Don’t suffer alone. Get help from expert collaborators or CROs who can quickly implement the infrastructure you need and scale it as necessary [2]. Begin by checking out this list of contract services available in data visualization, mining, analysis, interpretation, and database management through Assay Depot.

3. Support Open Science

In a perfect world, datasets would be stored electronically and shared freely among basic scientists, drug discovery experts, clinical researchers, healthcare providers, and CROs [2]. Over the last couple of decades, the Open Science movement has grown around this idea: the belief that science will be better if methodologies, data, and both negative and positive results are shared freely, even before publication. To get started, join the Open Science community and take advantage of the free, collaborative tools available online: follow the #OpenScience hashtag on Twitter, participate in collaborative projects such as OpenWorm and Open Source Malaria, learn how to protect your intellectual property at Creative Commons, publish all of your research outputs on Figshare, and check out open access publishers such as the Public Library of Science (PLOS) and PeerJ.

What are YOU doing to overcome Big Data bottlenecks? Let us know in the comments below.


  1. Paul Tunnah. Does pharma want good data or big data? Pharmaphorum. November 15, 2013.
  2. McKinsey & Company. How big data can revolutionize pharmaceutical R&D. April 2013.