AI Predicts More Than 1,000 Diseases Years Ahead Globally

AI model forecasts more than 1,000 diagnoses years before onset

Scientists from institutions in the United Kingdom, Denmark, Germany and Switzerland have developed an artificial intelligence model that can forecast rates of more than 1,000 medical conditions years in advance. The system, named Delphi-2M, is built on a transformer neural network, the same high-level architecture used in consumer chatbots such as ChatGPT, and was described in a paper published in Nature.

Delphi-2M was trained primarily on longitudinal health records from the UK Biobank, a large biomedical research resource containing detailed health, genetic and lifestyle information for roughly 500,000 participants. The research team then validated the model on nearly two million patient records from Denmark’s national health database, showing that many of its predictive signals replicate across countries.

How the model works and scientific context

Transformer models are best known for handling sequences of language, where they learn patterns and relationships among words. Researchers applied that same sequence-learning capability to clinical timelines: each diagnosis, test result or medical code in a patient’s history becomes an ordered token the model can learn from. As Moritz Gerstung of the German Cancer Research Center explained in the paper and public comments, understanding sequences of diagnoses is "a bit like learning the grammar in a text": the model identifies which events commonly precede others and which combinations signal elevated future risk.
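To make the idea of "an ordered token" concrete, here is a minimal sketch of how a patient history might be turned into a sequence a transformer could read. It is an illustration only, not the authors' pipeline: the Event class, the build_vocab and encode helpers, the age-in-days field and the ICD-10-style codes are all assumptions introduced for this example, and the two patient histories are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    age_days: int  # assumed field: patient's age in days when the event was recorded
    code: str      # assumed field: an ICD-10-style diagnosis code

def build_vocab(histories):
    """Map every distinct code to an integer id; 0 is left free for padding."""
    codes = sorted({event.code for history in histories for event in history})
    return {code: i + 1 for i, code in enumerate(codes)}

def encode(history, vocab):
    """Sort one patient's events by age and return (token_ids, ages)."""
    ordered = sorted(history, key=lambda event: event.age_days)
    return [vocab[e.code] for e in ordered], [e.age_days for e in ordered]

# Synthetic, hand-written histories (not real patient data).
histories = [
    [Event(20000, "I10"), Event(18000, "E11"), Event(21000, "I21")],
    [Event(15000, "J45"), Event(19000, "I10")],
]

vocab = build_vocab(histories)
token_ids, ages = encode(histories[0], vocab)
print(token_ids, ages)
```

The point of the sketch is that order and timing are preserved: the model sees not just which codes a patient has, but which came first and how far apart they were.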

The team reports that Delphi-2M can identify individuals whose risk of events such as heart attack is considerably higher or lower than conventional risk calculators alone would predict. Unlike single-condition tools (for example QRISK3, used to estimate cardiovascular risk in primary care), Delphi-2M aims to provide a multi-disease, long-horizon forecast: more than 1,000 conditions simultaneously, over years rather than months.

The model also uses a wide range of inputs drawn from clinical histories, laboratory tests and coded diagnoses. "Delphi-2M learns the patterns in healthcare data, preceding diagnoses, in which combinations they occur and in which succession," the authors wrote. That, they say, enables "health-relevant predictions."
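The notion of learning "in which succession" diagnoses occur can be illustrated with a far simpler stand-in than a transformer: counting how often one code is immediately followed by another. The sketch below is a toy under stated assumptions, not Delphi-2M's method. The function name next_event_frequencies is hypothetical, the sequences are invented, and a real transformer conditions on the whole history and on timing rather than on the previous event alone.

```python
from collections import Counter, defaultdict

def next_event_frequencies(sequences):
    """For each code, estimate how often each other code immediately follows it."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair every event with the one recorded right after it.
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    # Convert raw counts into relative frequencies per preceding code.
    return {
        prev: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for prev, followers in counts.items()
    }

# Synthetic, time-ordered code sequences (not real patient data).
sequences = [
    ["E11", "I10", "I21"],
    ["I10", "I21"],
    ["I10", "N18"],
    ["J45", "I10", "I21"],
]

freqs = next_event_frequencies(sequences)
print(freqs["I10"])
```

In this toy data, three of the four occurrences of "I10" are followed by "I21", so the estimated frequency is 0.75. A sequence model generalizes the same intuition to long histories and many interacting conditions.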

Validation, limitations and ethical considerations

Although early results are promising, the authors and external reviewers stress that Delphi-2M is not yet ready for clinical deployment. Validation across two large datasets strengthens confidence in the model’s predictive signals, but both datasets have known biases in age distribution, ethnicity representation and local healthcare practices. Peter Bannister, a health technology researcher and fellow at Britain’s Institution of Engineering and Technology, noted these limitations and emphasized the distance between a research prototype and improved routine care.

Co-author Tom Fitzgerald of the European Molecular Biology Laboratory highlighted systems-level benefits, suggesting that predictive models of this type could help optimize resource allocation across strained health services. Co-author Ewan Birney contrasted Delphi-2M with existing clinical risk tools by pointing out its disease-agnostic, multi-year scope: "It can do all diseases at once and over a long time period."

Gustavo Sudre, a medical AI specialist at King’s College London, described the work as "a significant step towards scalable, interpretable and — most importantly — ethically responsible predictive modelling." Interpretability remains a central research goal, because many large models still exhibit internal behavior that is difficult for human experts to fully explain.

Potential applications and next steps

If further validated and integrated with care pathways, models like Delphi-2M could influence preventive medicine by flagging patients for closer monitoring, lifestyle interventions, or earlier diagnostic testing. Health systems could use aggregated forecasts to plan staffing, diagnostics capacity and targeted public-health initiatives. However, robust external validation, prospective clinical trials, fairness assessments across diverse populations and clear regulatory frameworks will be essential before deployment.

Related technologies

This research intersects with broader developments in medical AI: electronic health record phenotyping, federated learning for multi-site training without raw data sharing, and explainable AI tools that highlight which features drive individual risk predictions.

Expert Insight

Dr. Anna Reyes, biomedical data scientist and science communicator, comments: "Delphi-2M demonstrates how sequence models can extract clinically meaningful signals from complex patient timelines. The real test will be translating those signals into actionable, equitable interventions. That requires careful prospective studies and collaboration between clinicians, data scientists and ethicists to avoid amplifying existing health disparities."

Conclusion

Delphi-2M represents a notable advance in predictive medicine: a transformer-based AI capable of estimating risks for more than 1,000 diseases years ahead by learning patterns in patient histories. Early validation across UK and Danish datasets shows promise, but the authors and outside experts caution that biased data, interpretability challenges and the need for prospective clinical testing mean the technology is still some way from routine use. If those hurdles are addressed, disease-agnostic forecasting tools could become a component of future preventive care and health-system planning, complementing, not replacing, clinical judgment.

Source: sciencealert
