At the back end of last year I suggested that healthcare is increasingly a big data problem, as the volume of data surrounding our health grows exponentially. Indeed, it’s typically growing much faster than our ability to capitalize on it.
The latest big data dump comes by way of the American Association for Cancer Research (AACR), which has released 19,000 de-identified genomic records to the research community to support work on precision medicine.
The release is the product of AACR Project Genomics Evidence Neoplasia Information Exchange (GENIE), and covers nearly 60 different types of cancer. It includes both clinical and genomic data.
“We are excited to make publicly available this very large set of clinical-grade, next-generation sequencing data obtained during routine patient care,” the team say. “These data were generated as part of routine patient care and without AACR Project GENIE they would likely never have been shared with the global cancer research community.”
The project is working with a number of research institutions, including eight that focus on cancer research. The partners aim to expand our knowledge of cancer, and of course the potential treatments for it.
“We are committed to sharing not only the real-world data within the AACR Project GENIE registry but also our best practices, from tips about assembling an international consortium to the best variant analysis pipeline, because only by working together will information flow freely and patients benefit rapidly,” they say.
The project forms part of the wider precision medicine research initiative launched by President Obama. It also provides further evidence of the growing appreciation of what becomes possible when big data is shared openly and transparently.
With the Cancer Moonshot being the focal point of this initiative, it is perhaps not surprising that cancer has received considerable attention from the research community.
The latest such effort is the recently launched Deloitte X-Prize, which provides a $20 million prize fund to help combat the disease.
One of the core goals of the Moonshot is to develop a system-wide infrastructure for the sharing of medical data, with institutions encouraged to exchange insights and support each other's work.
Projects such as the X-Prize and AACR Project GENIE, alongside existing ventures such as UK Biobank, will increasingly make genomic and clinical data available to researchers around the world.
At the moment, much of that data is quite siloed, but imagine if genomic data could be easily paired with the lifestyle data generated by mobile and wearable devices, and both then linked to medical records. Open collaboration around data creates a world of possibilities and, to quote Victor Hugo, it seems to be an ‘idea whose time has come’.