Betaboston.com

Broad Institute, Google Genomics to develop online tools to analyze genetic data

2015-06-24

The Broad Institute of MIT and Harvard, which has amassed the world’s largest collection of genetic data about diseases, is teaming up with Google to create a simpler way to help far-flung scientists pursue their own research online.

The partners will give researchers from Boston to Beijing easy online access through Google to the Broad Institute’s tools for converting genetic data gathered from blood and tissue samples into useful information about mutations that underlie cancers and other diseases.

“Our mission is to empower the biomedical revolution that’s happening around the world,” Eric Lander, founding director of the Broad Institute, said Tuesday. “Central to our mission is to make sure the tools we develop are as broadly accessible as possible. As you get to huge volumes of data, you get to the question of where you’re going to store this data. And storing it on your own computers makes less and less sense.”

The Cambridge institute’s tools will be available on Google’s “cloud platform” for a yet-to-be-determined charge. Google Genomics, a two-year-old division of the online search company, was set up to organize the world’s genomic data. The division will store research information and enable scientists from different universities and institutions to collaborate with one another over the Google platform.

“The Broad is expert at developing the algorithms to analyze genomic data,” Google Genomics engineering director David Glazer said. “Google is expert at running algorithms at large scale. When you put that together, you have the best algorithms running in the best environment.”

Typically, genomics researchers run biological samples through a DNA sequencer, converting them into bits of information. But to analyze that information, the researchers must download the Broad’s tools, run their computations on multiple computers, copy files onto hard drives, and ship them around the country or across the ocean to collaborators.

The Broad-Google partnership will enable much of that to be done on Google’s servers.

Other technology companies, including Microsoft and Amazon, are creating their own Internet platforms for managing genetic data. The Broad Institute’s collaboration with Google is nonexclusive and would not prevent the institute from working with others, Lander said. The Broad’s tools have been used by about 20,000 scientists around the world.

“This is another option for researchers,” Lander said. “If I look out the window here [in Cambridge], I’m looking at Google’s office. It’s natural that we collaborate on this.”

The move is part of a larger push to put health information on the Internet, where it can be more easily stored and accessed, said Judy Hanover, research director for the technology research firm IDC Health Insights in Framingham.

“We’re seeing more digitization and more collaborations,” Hanover said. “When it comes to large data sets that are difficult to manage, move, and manipulate, you want to use a third party. And cloud computing is becoming more important. The cloud lends itself to fast tools and public-use tools that are important in collaborations using genomics data.”

Lander said the Broad Institute is sequencing the equivalent of one human genome, the set of genetic instructions that define a person, every 25 minutes. To date, it has sequenced about 15,000 genomes and about 140,000 exomes, the protein-coding portions of human DNA.