Researchers have developed a novel genome assembly tool that could spur the development of new treatments for tuberculosis and other bacterial infections.

Medical illustration of drug-resistant Mycobacterium tuberculosis bacteria. Source: CDC/Antibiotic Resistance Coordination and Strategy Unit

The new tool, which has created an improved genome map of one tuberculosis strain, should do likewise for other strains and other types of bacteria, according to researchers whose findings appeared in Nature Communications.

Mycobacterium tuberculosis, the bacterium that causes tuberculosis, infects about a quarter of the world’s population and killed 1.6 million people in 2021, according to the World Health Organization.

“The key to beating this disease is to understand it, and the key to understanding it lies in its DNA,” said David Alland, the study’s senior author, who is chief of the Division of Infectious Diseases at Rutgers New Jersey Medical School and director of the school’s Public Health Research Institute.

“We hope our new pipeline provides researchers around the world with the information they need to create faster, more effective treatments and, ideally, a fully effective vaccine.”

Scientists first sequenced the genome of one tuberculosis strain, H37Rv, in 1998, but until now they could never generate the sort of complete and accurate sequence that would maximize their chances of eradicating the disease.

The new pipeline, dubbed Bact-Builder, combines common open-source genome assembly programs into a novel, easy-to-use tool that is freely available on GitHub.

Scientists today typically sequence new bacterial genomes by cutting large pieces of DNA into small, quick-to-scan fragments and then using a reference sequence such as H37Rv to align all the resulting pieces of data properly. However, assembling genomes without a reference, as Bact-Builder does with data from MinION sequencers, allows researchers to identify genes present in clinical strains that may not be present in the reference.

The tuberculosis sequence created by Bact-Builder contains approximately 6,400 more pieces of information (base pairs) than the old reference and, more importantly, identifies new genes and gene fragments missing from the old reference.

“Just publishing a fully accurate genome for the H37Rv reference strain, which is used in hundreds of studies a year, should significantly help tuberculosis research,” Alland said.

Having an easy way to sequence all strains accurately is even more important, Alland said, because comparing strains should answer many vital questions, such as why some strains are more contagious than others.