Google is joining forces with the Broad Institute of MIT and Harvard to launch the institute's Genome Analysis Toolkit (GATK) in alpha on Google Cloud Platform.
The Institute has the largest collection of genetic data on various diseases, and its collaboration with Google aims to benefit not only the parties directly involved but also, and especially, academic researchers and scientists.
The raw data of a single person's genome takes up at least 100 GB, which means that DNA sequencing generates enormous volumes of data. The Broad Institute has a collection of more than 1.4 million biological samples, either sequenced or genotyped, and analyzing all of this data requires tremendous computing resources. The partnership with Google is meant to ease that burden.
The collaboration will focus on providing the computing infrastructure needed to store and process huge data sets, while also creating tools for analyzing that data, so that biomedical research can progress unhindered by infrastructure limitations.
GATK will be part of Google Genomics and will be available as a service on Google Cloud Platform, allowing scientists to analyze genomic sequencing data quickly and efficiently. Researchers will be able to use the software for free, but they will still have to pay for their use of Google Cloud Platform. At the same time, Broad will also license the software to interested business users. So far, more than 20,000 users have reportedly processed genomic data with GATK.
"The goal is to enable any genomic researcher to upload, store, and analyze data in a cloud-based environment that combines the Broad Institute's best-in-class genomic analysis tools with the scale and computing power of Google," noted the press release on Wednesday, June 24, announcing the partnership.
Eric Lander, Broad Institute President and Director, further points out the importance of large-scale genomic information in accelerating scientific progress on many diseases, including cancer, diabetes, and psychiatric disorders. Biomedical researchers face a great challenge when it comes to storing, analyzing, and managing genomic sequencing data, and the new partnership with Google is intended to help them overcome such hurdles.
"We are excited to work with Google's talented and experienced engineers to develop ways to empower researchers around the world by making it easier to access and use genomic information," stated Lander.