Identifying Differences in Disease-causing Genomes Just Went from a Few Days to a Few Hours


From disease-causing organisms to mutations caused by cancer, diagnosing many health concerns often begins with identifying variations in genomes. But with existing technology, this can take several days.

A new algorithm developed at Georgia Tech, however, is shifting this paradigm by dramatically reducing the time it takes to identify genomic variations in a cell or organism to just a few hours or even minutes.

“The faster these variations can be determined, the more quickly an outbreak can be thwarted or a treatment can be prescribed. This is why we need to analyze variations in genomes as accurately and quickly as possible,” said Chirag Jain, the lead investigator on the project.

“Aligning DNA sequences to an annotated reference is a key step for genotyping in biology,” said Jain. “That is why we created PasGAL, the first multi-core parallel algorithm that can align a sequence to a complex genome graph.”

The PasGAL software, co-designed by researchers from the School of Computational Science and Engineering and Intel, takes advantage of modern computer architectures, which can use multiple cores and perform the same operation on multiple data points simultaneously.

“That is why matching a DNA sequence to a reference graph, as is done with PasGAL, speeds up the process of genome analysis so remarkably,” he said.

Jain said, “The previous way of identifying variations in genomes was to use reference DNA from a particular individual – which could only assess a single individual genome at a time. This method did not use the extensive genomics data now available across multiple individuals and populations.”

Recent studies indicate that by using a graph rather than a linear reference, it is possible to include multiple genomes of multiple individuals simultaneously. 

“Now, we are able to compute optimal alignments of DNA sets to large graphs and multiple genomes within a few minutes or hours, which was not feasible with prior algorithms.”
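
To make the idea of aligning a sequence to a graph rather than a linear reference concrete, here is a minimal sketch in Python. It is not PasGAL's implementation: it is a toy edit-distance dynamic program over a small directed acyclic graph, run on a single core. The function name `align_to_graph`, the one-character-per-node layout, and the assumption that nodes are numbered in topological order are all illustrative choices, not details from the project.

```python
from typing import Dict, List


def align_to_graph(query: str, labels: List[str], preds: Dict[int, List[int]]) -> int:
    """Return the minimum edit distance between `query` and any path in a DAG.

    Assumptions (illustrative only): labels[v] is the single character on
    node v, nodes are numbered in topological order, and preds[v] lists the
    predecessors of v. The path may start and end at any node.
    """
    m = len(query)
    # Virtual start column: query[:i] consumed before entering the graph.
    start = list(range(m + 1))
    columns: List[List[int]] = []
    best = m  # fallback: align to an empty path by inserting every character

    for v, label in enumerate(labels):
        # A path may begin at v, so the virtual start is always a candidate.
        prev_cols = [columns[u] for u in preds.get(v, [])] + [start]
        # Best score over all predecessors, for each query prefix length.
        prev = [min(col[i] for col in prev_cols) for i in range(m + 1)]

        col = [prev[0] + 1]  # node v aligned against an empty query prefix
        for i in range(1, m + 1):
            substitute = prev[i - 1] + (query[i - 1] != label)
            skip_node = prev[i] + 1      # graph character left unmatched
            skip_query = col[i - 1] + 1  # query character left unmatched
            col.append(min(substitute, skip_node, skip_query))

        columns.append(col)
        best = min(best, col[m])

    return best


# Linear reference: A -> C -> G -> T
linear_labels = ["A", "C", "G", "T"]
linear_preds = {1: [0], 2: [1], 3: [2]}

# Graph reference with a variant at the second position: A -> (C | G) -> G -> T
graph_labels = ["A", "C", "G", "G", "T"]
graph_preds = {1: [0], 2: [0], 3: [1, 2], 4: [3]}

print(align_to_graph("AGGT", linear_labels, linear_preds))
print(align_to_graph("AGGT", graph_labels, graph_preds))
```

In this toy case the query AGGT is one substitution away from the linear reference (distance 1) but matches a path in the graph exactly (distance 0), because the graph encodes both variants at the second position.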

According to Jain, the team intends for this module to eventually be integrated with other software so that it can be used directly in hospitals and labs for quick analysis.

This research won the best poster award at RECOMB 2019 and will be presented as part of the advance program at the 33rd IEEE International Parallel and Distributed Processing Symposium next week.

[Related Links: New Approach Speeds Genomic Testing of Microbial Species]

Contact: 

Kristen Perez

Communications Officer