Mingyan Fang

Research Area: Genomics, Computational Biology

Professor and Principal Investigator, BGI-Research, Shenzhen

Professor and Principal Investigator working at the intersection of genomics and artificial intelligence, with a Ph.D. in Clinical Immunology from Karolinska Institutet, Sweden. Over 15 years of experience in computational biology and translational medicine, dedicated to deciphering the mechanisms of immune diseases through innovative AI/ML approaches and large-scale multi-omics integration.

70+
Publications
20+
Patents
3,400+
Citations

Research Highlights

🤖

AI/Algorithm Innovation

Developed VIPPID (first specialized Primary Immunodeficiency variant predictor), GeneRAIN (Transformer-based GRN model), and VIPER (LLM for genetic disease gene identification).

🧬

Disease Mechanism Research

Systematically identified 50+ causative genes for human diseases (inborn errors of immunity, autoimmune and neuroimmune disorders, etc.) and elucidated their pathogenic mechanisms.

📊

Large-Scale Data Platforms

Developed the ZBOLT genomic analysis platform with a capacity of 100 Tbp/day, enabling ultra-large-scale population studies.

Selected Publications

Representative works from recent years. See the full list on the Publications page.

Genome sequencing of 7,140 newborns reveals hidden genetic disease burden

M. Fang#*, Y. Huang*, Y. Mei*, X. Jia*, Y. Gao*, X. Wang, Y. Sun, Y. Zeng, W. Huang, L. Zhu, Z. Duan, Y. Xie, X. Jiang, H. Zeng, J. Tang, X. Qian, Z. Li, Y. Yi, G. Zhang, Y. Huang, C. Liu, G. Huang, W. Zeng, B. Wang, Y. Miao, Y. Bai, H. Huang, Y. Xiao, J. Liu, X. Xu, S. Pan*, L. Hammarström*, F. Chen*, X. Jin*. Advanced Science, under first-round review (IF: 14.1, CAS Q1)

Large-scale newborn genome sequencing study revealing a hidden genetic disease burden and establishing a foundation for population-level genomic screening.

First & Corresponding Author | Genomics - Clinical Applications

GeneRAIN: multifaceted representation of genes via deep learning of gene expression networks

Z. Su*, M. Fang*, A. Smolnikov, ME. Dinger, EC. Oates, F. Vafaee. Genome Biology 26, 288 (2025). (IF: 9.4, CAS Q1)

Novel deep learning framework for gene representation learning using 777K bulk transcriptomes, advancing AI-driven genomics research.

Co-First Author | AI Foundational Model Development

Age-Related Dynamics and Spectral Characteristics of the TCRβ Repertoire in Healthy Children: Implications for Immune Aging

M. Fang#*, Y. Miao#, L. Zhu, Y. Mei, H. Zeng, L. Luo, Y. Ding, L. Zhou, X. Quan, Q. Zhao, X. Zhao, Y. An#. Aging Cell, 2025.

Characterizes age-related dynamics in the TCRβ repertoire of healthy children, informing immune aging mechanisms.

First and Corresponding Author | Immune Aging

An efficient large‐scale whole‐genome sequencing analyses practice with an average daily analysis of 100Tbp

Z. Li*, Y. Xie*, W. Zeng*, Y. Huang, S. Gu, Y. Gao, W. Huang, L. Lu, X. Wang, J. Wu, X. Yin, R. Zhu, G. Huang, L. Lu, J. Tang, Y. Zheng, Q. Liu, X. Zhou, R. Shan#, B. Wang#, M. Fang#, X. Jin#. Clinical and Translational Discovery, 2023. (IF: 1.9)

Development of the ZBOLT platform, enabling ultra-large-scale genomic data processing at a capacity of 100 Tbp/day.

Co-Corresponding Author | Big Data Analysis Platform Development

VIPPID: a gene specific single nucleotide variant pathogenicity prediction tool for Primary Immunodeficiency Diseases

M. Fang*, Z. Su*, H. Abolhassani, Y. Itan, X. Jin, L. Hammarström. Briefings in Bioinformatics, bbac176 (2022). (IF: 13.994, CAS Q1)

First specialized variant pathogenicity predictor for Primary Immunodeficiency Diseases, revolutionizing clinical genetic diagnosis.

First Author | Machine-learning Core Analytical Method

Get in Touch

For inquiries about collaboration, student supervision, or other matters:

Contact Page