At its heart, genomics is a data problem, and much of the science that has developed around it concerns the successful management of that data rather than specific medical advances.
So it is perhaps not surprising that researchers at the University of California San Diego have released a new search engine intended to make genomic data easier to search.
The search engine, called GeNemo, is described in a recently published paper and is built specifically for searching functional genomic data.
Searchable genomic data
Functional genomic data is valuable because it records the range of activities of each piece of the genome. The team behind the new search engine hope it will help researchers uncover the functional roles of parts of the genome that are thought to be responsible for disease.
The search engine allows users to query a range of databases, including the entire ENCODE dataset. Its search algorithm uses pattern matching within the data itself to offer richer results than traditional text-based searches.
“If you think of functional genomic data files as video files, then the ‘text search’ is like searching by keywords in the title or the description of a video file. The ‘inside data search’ is like searching for a video clip by pattern matching within the video itself,” the team say.
“Functional genomic assays are producing massive amounts of data, in challenging data types. We have developed an online tool that empowers users to input any complete or partial functional genomic dataset, for example, a binding intensity file like bigWig, or a peak file,” they continue. “GeNemo reports any genomic regions, ranging from 100 bases to 100,000 bases, from any of the online ENCODE datasets that share similar functional patterns such as binding, modification and accessibility.”
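To make the idea of an "inside data search" more concrete, here is a minimal sketch of pattern matching on a signal track. It is not GeNemo's actual algorithm, which the article does not detail. It assumes that a functional signal can be represented as a plain array of intensities, and it uses a sliding-window Pearson correlation as a stand-in similarity measure. The function name `find_similar_regions`, the `threshold` parameter and the toy data are all hypothetical.

```python
import numpy as np

def find_similar_regions(query, track, threshold=0.9):
    """Slide `query` along `track` and return (start, score) pairs for every
    window whose Pearson correlation with the query is at least `threshold`.

    Illustrative only: an assumed similarity measure, not GeNemo's method.
    """
    query = np.asarray(query, dtype=float)
    track = np.asarray(track, dtype=float)
    n = len(query)

    # Centre the query once; the same centred vector is reused for every window.
    q = query - query.mean()
    q_norm = np.linalg.norm(q)

    hits = []
    for start in range(len(track) - n + 1):
        window = track[start:start + n]
        w = window - window.mean()
        denom = q_norm * np.linalg.norm(w)
        if denom == 0:
            # A flat window (or flat query) has no defined correlation.
            continue
        score = float(np.dot(q, w) / denom)
        if score >= threshold:
            hits.append((start, score))
    return hits

# Toy data: a peak-shaped query and a track holding the same shape twice,
# once at the original height and once at double the height.
peak = np.array([1.0, 4.0, 9.0, 4.0, 1.0])
track = np.zeros(60)
track[20:25] = peak
track[40:45] = 2 * peak

hits = find_similar_regions(peak, track)
for start, score in hits:
    print(f"match at position {start}, correlation {score:.3f}")
```

Because correlation ignores overall scale, the taller copy of the peak is reported alongside the original. A text search over file names or descriptions could not find either, which is the distinction the video analogy draws.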
The search engine is believed to be the first online tool to offer this kind of functional genomic search, and the team are excited about how it could help researchers around the world.