Electronic patient records, clinical trials, DNA sequencing, medical imaging, and disease registries are a sampling of the sources contributing to the exponential growth of public databases housing biomedical information. Researchers hope that mining this vast reservoir of data will accelerate the process of understanding disease while driving down the costs of developing new therapies.
But the challenge of harnessing big data to transform scientific research and improve human health is too complex for any single person, institution or company to solve alone; collaboration among government, academia and industry is imperative. To foster such partnerships, Stanford and Oxford University are sponsoring the Big Data in Biomedicine conference May 21-23.
The conference is part of a big data initiative launched by Stanford and Oxford to solve large-scale data problems and improve health worldwide. Euan Ashley, MD, who directs the effort at Stanford, has been involved in several major projects over the past few years linking an individual’s genome sequence to possible increases in disease risk. In the following Q&A, he shares insights about the upcoming conference program, provides an update on the initiative, and discusses how big data can drive innovation for a healthier world.
A collaborative effort between Oxford and Stanford aims to accelerate discovery from large-scale data sets to provide new insight into disease and to apply targeted therapies on an unprecedented scale. In what ways are the universities currently working together to achieve this goal?
The Global Institute for Human Health Initiative is a very exciting new venture between these two universities. Catalyzed by the Li Ka Shing Foundation, the initiative draws on the complementary strengths of each institution. Stanford excels in innovation, technology and data management and analysis. Oxford has global reach through its School of Public Health. So it makes sense to work together.
One of our primary goals will be to build “bridges” between the largest databanks of health information in the world. These individual large-scale efforts are remarkable in their own right, but each one must, by definition, focus primarily on its own data. That leaves limited bandwidth for developing mechanisms for secure sharing and analysis; that bandwidth and expertise are what we hope to provide through the initiative. The seed grants awarded through our program in Data Science for Human Health are another way we have started to collaborate. Each one has an Oxford-Stanford collaboration at its heart.
Tell us more about those seed grants. How many have you awarded, and for what kinds of projects?
We received 60 applications and were able to award 12 grants totaling $807,171.48. Among the projects receiving funding were new methods for analyzing accelerometer data from smartphones, approaches to imaging data, and ideas for large-scale data analysis, point-of-care testing for infectious disease and mobile application development. It was an amazing group of applications, and I wish we could have funded more projects. At the conference, there will be a brief satellite meeting for the recipients to interact.
Let’s talk more about the upcoming conference. What else can attendees expect from it?
We have an exciting program with a number of high-profile speakers. I’m particularly pleased this year with the broad representation of presenters across sectors. There will be speakers from government, industry and academia, including representatives from the National Institutes of Health, Google, Intel, Mount Sinai and Duke.
We’ve also expanded our international reach, and one of the keynote speeches will be delivered by Ewan Birney, director of the European Bioinformatics Institute. Additionally, this year’s program includes two new topic areas: computing and architecture, which will be chaired by Hector Garcia-Molina, PhD, and infectious disease genomics, a particular strength at Oxford. Another addition is the Big Data Corporate Showcase, where companies ranging from industry giants to start-ups will share their achievements and innovations related to big data. So, lots to look forward to!
Why is it so important to bring together experts from academia, technology and industry when discussing big data and human health?
The fact is that quality data science happens across all sectors. One of the amazing characteristics of Silicon Valley is the extent to which people can move across what are, in effect, artificial boundaries. Collaboration is very natural, essential even, in a big data world. For starters, end-to-end solutions usually require more than one kind of expertise: any data analysis pipeline involves transforming data, manipulating information, running statistical analyses and, finally, visualizing the results, as sketched in the example below. In biomedicine, any analysis needs to start with patients. Additionally, practitioners have to be at the core to make sure we are asking the right questions and providing answers that translate to better health care. Bringing all these groups together in the same room is one of the most exciting aspects of the Big Data in Biomedicine conference.
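To make those pipeline stages concrete, here is a minimal, hypothetical sketch in Python. The synthetic data, column names and choice of a two-sample t-test are illustrative assumptions only; they are not drawn from any Stanford or Oxford project.

```python
# A minimal sketch of the pipeline stages described above: transform raw data,
# manipulate it, run a statistical analysis, and visualize the result.
# All data here are synthetic and purely illustrative.
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

# 1. Transform: load raw measurements into a tidy table (synthetic example).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": ["control"] * 50 + ["treated"] * 50,
    "biomarker": np.concatenate([rng.normal(1.0, 0.3, 50),
                                 rng.normal(1.2, 0.3, 50)]),
})

# 2. Manipulate: summarize the measurements by group.
summary = df.groupby("group")["biomarker"].agg(["mean", "std", "count"])
print(summary)

# 3. Statistical analysis: two-sample t-test between groups.
t_stat, p_value = stats.ttest_ind(
    df.loc[df.group == "control", "biomarker"],
    df.loc[df.group == "treated", "biomarker"],
)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# 4. Visualize: compare the two distributions.
df.boxplot(column="biomarker", by="group")
plt.savefig("biomarker_by_group.png")
```

In practice, each of these steps is often handled by a different specialist, which is exactly why the cross-sector mix Ashley describes matters.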
At last year’s conference, you discussed how with six billion data points in each person’s genome there’s an inherent difficulty in ultra-detailed personalized analysis. What progress is being made to automate the process of interpreting individuals’ genomic data to understand diseases risk and guide treatment decisions?
We’re delighted to feature a session on genomic medicine this year with a stellar faculty, including Teri Manolio, MD, PhD, director of genomic medicine at the National Human Genome Research Institute. She’ll likely outline the many areas in which the NIH is investing to bring the “base pairs” to the bedside.
Stanford is, of course, heavily involved in this area. We recently launched our Clinical Genomics Service, which is very exciting and features the most up-to-date version of the Stanford genome interpretation pipeline. Additionally, in the last year Stanford assumed a leadership role in a new clinical variants database, the Clinical Genome Resource (ClinGen), funded by the NIH and the National Center for Biotechnology Information. Carlos Bustamante, PhD, is the principal investigator from Stanford, but Mike Cherry, PhD, I and many others are involved. ClinGen will likely become the world’s premier resource for disease-associated genetic variants, much as the Pharmacogenomics Knowledgebase (PharmGKB), led by Stanford’s Russ Altman, MD, PhD, is today for genetic variants related to medications.
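To give a sense of what a curated clinical variant resource enables, here is a heavily simplified, hypothetical sketch: a patient’s variants are matched against a small curated table of disease-associated variants. The schema, coordinates and classifications are invented for illustration; this is not the Stanford interpretation pipeline, and real resources such as ClinGen and PharmGKB are far richer.

```python
# A hypothetical, simplified illustration of clinical variant interpretation:
# match a patient's variants against a curated table of disease-associated
# variants. Coordinates and classifications below are invented placeholders.
import pandas as pd

# Curated knowledge base: (chromosome, position, ref, alt) plus an assertion.
clinical_db = pd.DataFrame([
    {"chrom": "7",  "pos": 117559590, "ref": "ATCT", "alt": "A",
     "gene": "CFTR",  "classification": "pathogenic"},
    {"chrom": "17", "pos": 43093464,  "ref": "G",    "alt": "A",
     "gene": "BRCA1", "classification": "uncertain significance"},
])

# Patient variants, e.g. parsed from a VCF (values here are illustrative).
patient_variants = pd.DataFrame([
    {"chrom": "7", "pos": 117559590, "ref": "ATCT", "alt": "A"},
    {"chrom": "1", "pos": 55051215,  "ref": "G",    "alt": "GA"},
])

# Annotate: keep only patient variants that have a curated clinical assertion.
annotated = patient_variants.merge(
    clinical_db, on=["chrom", "pos", "ref", "alt"], how="left"
)
reportable = annotated.dropna(subset=["classification"])
print(reportable[["chrom", "pos", "gene", "classification"]])
```

Real interpretation pipelines layer population frequencies, functional predictions and expert review on top of this kind of lookup, which is where much of the data challenge Ashley describes comes from.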
How are experts outside of the School of Medicine involved in your efforts?
We’re extremely lucky at Stanford to have expertise in data science in every school of the university. As part of the Global Institute Initiative, we’ve been reaching out to leaders in these schools to discuss how best to bring their unique expertise to the table. There are many areas of overlap. For example, in social science, computational approaches to demography tie in very well with population sciences in the School of Medicine. Computer science and the School of Engineering clearly represent an area of huge importance for us. Equally, theoretical data science, including statistics, helps us with the underpinnings of much of what we aim to do.
Previously: Registration opens for Big Data in Biomedicine conference at Stanford, Grant from Li Ka Shing Foundation to fund big data initiative and conference at Stanford, Big laughs at Stanford’s Big Data in Biomedicine Conference, and A call to use the “tsunami of biomedical data” to preserve life and enhance health
Photo from last year’s Big Data in Biomedicine conference by Saul Bromberger