In a study published in the journal Environmental Science and Ecotechnology, researchers from Huazhong University of Science and Technology have introduced ‘Meta-Sorter’, an AI-based method that uses neural networks and transfer learning to improve biome labeling for thousands of microbiome samples in the MGnify database, especially those with incomplete information.


The Meta-Sorter approach comprises two steps. Firstly, a neural network model is constructed using 118,592 microbial samples from 134 biomes and their respective biome ontology; it achieves an average AUROC of 0.896.

This model accurately classifies samples with detailed biome information, serving as a strong foundation for further analyses.
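
For readers unfamiliar with the metric, the sketch below shows one common way an "average AUROC" over several biomes can be computed: a one-vs-rest AUROC per biome, then an unweighted mean. This is an assumption for illustration; the article does not say how the study averaged its scores, and the biome names and prediction scores here are invented toy values, not data from the study.

```python
def auroc(scores, labels):
    """AUROC for one class: the fraction of (positive, negative) pairs
    in which the positive sample gets the higher score (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_auroc(score_rows, true_biomes, biomes):
    """Unweighted mean of the one-vs-rest AUROC over all biomes."""
    per_biome = []
    for j, biome in enumerate(biomes):
        scores = [row[j] for row in score_rows]          # score for this biome
        labels = [1 if t == biome else 0 for t in true_biomes]
        per_biome.append(auroc(scores, labels))
    return sum(per_biome) / len(per_biome)


# Hypothetical toy data: one row of predicted scores per sample,
# one column per biome (names are placeholders).
biomes = ["soil", "marine", "gut"]
score_rows = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.1, 0.2, 0.7],
    [0.4, 0.1, 0.5],
]
true_biomes = ["soil", "soil", "marine", "marine", "gut", "gut"]

mean_auroc = average_auroc(score_rows, true_biomes, biomes)
print(f"average AUROC: {mean_auroc:.3f}")
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a sense of scale for the reported values.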

Transfer learning

Secondly, to handle newly introduced samples whose characteristics differ from those of the original training data, the researchers applied transfer learning using 34,209 newly added samples from 35 biomes, eight of which are novel.

The transfer neural network model achieved an average AUROC of 0.989 and successfully predicted biome information for newly introduced samples annotated as ‘Mixed biome’. Overall, Meta-Sorter reached an accuracy of 96.7% in classifying the 16,507 samples that lacked detailed biome annotations.
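
The general idea of transfer learning can be illustrated with a minimal sketch: keep the layers learned in the first step frozen and train only a new output layer on the newly added samples. This is one common variant, assumed here for illustration; the article does not specify which layers Meta-Sorter retrains. The weights, the three-value abundance vectors, and the class indices below are all hypothetical placeholders.

```python
import math


def frozen_features(x, base_weights):
    """Pass a sample through the pretrained hidden layer (ReLU units).
    These weights are never updated during transfer."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in base_weights]


def logits(features, head):
    """Linear output layer; the last entry of each row is the bias."""
    return [sum(w * v for w, v in zip(row, features)) + row[-1] for row in head]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def train_new_head(samples, labels, base_weights, n_classes, epochs=200, lr=0.5):
    """Fit only a new softmax output layer with plain stochastic gradient descent."""
    feats = [frozen_features(x, base_weights) for x in samples]
    dim = len(feats[0])
    head = [[0.0] * (dim + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            probs = softmax(logits(f, head))
            for k, row in enumerate(head):
                grad = probs[k] - (1.0 if k == y else 0.0)  # cross-entropy gradient
                for i in range(dim):
                    row[i] -= lr * grad * f[i]
                row[-1] -= lr * grad
    return head


def predict(x, base_weights, head):
    z = logits(frozen_features(x, base_weights), head)
    return max(range(len(z)), key=z.__getitem__)


# Placeholder for weights learned in the first step (hand-written here).
base_weights = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, -1, 0]]

# Hypothetical newly added samples: toy abundance vectors and biome indices.
new_samples = [
    [0.8, 0.1, 0.1], [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1], [0.2, 0.7, 0.1],
    [0.1, 0.1, 0.8], [0.2, 0.1, 0.7],
]
new_labels = [0, 0, 1, 1, 2, 2]

new_head = train_new_head(new_samples, new_labels, base_weights, n_classes=3)
print(predict([0.75, 0.15, 0.10], base_weights, new_head))
print(predict([0.15, 0.10, 0.75], base_weights, new_head))
```

Reusing the frozen layers means the new samples only have to teach the model how to map existing features onto the updated set of biomes, which is why a comparatively small set of new samples can suffice.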

This advance effectively resolves the issue of cascading errors and opens up new possibilities for knowledge discovery across scientific disciplines, particularly in environmental research.

Moreover, Meta-Sorter refines the biome annotation of under-annotated and mis-annotated samples. By automatically assigning precise classifications to ambiguous samples, it provides insights beyond the original literature, and by separating samples into specific environmental categories it improves the reliability and validity of research conclusions.