Palo Alto, CA – In a groundbreaking achievement, scientists have unveiled Evo 2, the largest biological artificial intelligence (AI) model ever created. Trained on a massive dataset of 128,000 genomes spanning the entire tree of life, from humans to single-celled bacteria and archaea, Evo 2 can write entire chromosomes and small genomes from scratch. It can also interpret existing DNA, including non-coding genetic variants linked to diseases whose effects were previously difficult to explain.
Developed collaboratively by researchers at the Arc Institute, Stanford University, and NVIDIA, Evo 2 represents a significant leap forward in our understanding of biology. The model is accessible to scientists through an online interface, with its software code, data, and parameters for replication available for free download on GitHub.
The lead author of the study, Brian Hie, a computational biologist at Stanford University, highlighted the AI’s capacity to uncover patterns invisible to the human eye. This ability stems from the model’s deep learning architecture, which allows it to identify complex relationships within vast datasets of genomic information.
The predecessor to Evo 2, simply named Evo, garnered significant attention last November when it graced the cover of Science magazine. Evo, trained on 80,000 bacterial, archaeal, and viral genomes, laid the foundation for Evo 2’s more expansive capabilities.
In an interview with Quanta Magazine prior to the release of Evo 2, Hie drew parallels between DNA and human language, suggesting that AI can be instrumental in translating the complex code of life. “AI can see patterns that humans simply can’t,” Hie explained. It can identify subtle relationships between genes and their functions, even in regions of the genome that were previously considered ‘junk DNA’.
Evo 2’s potential applications are vast and transformative. It could accelerate drug discovery by identifying novel therapeutic targets, revolutionize synthetic biology by enabling the design of new organisms with desired traits, and deepen our understanding of the genetic basis of diseases.
The availability of Evo 2 as an open-source resource is expected to further accelerate research in these areas, fostering collaboration and innovation across the scientific community. As Hie noted, “By making Evo 2 accessible to everyone, we hope to empower researchers around the world to unlock the secrets of the genome and develop new solutions to some of humanity’s most pressing challenges.”
Conclusion:
Evo 2 marks a paradigm shift in biological research, demonstrating the power of AI to unravel the complexities of life. Its ability to generate novel genetic sequences and decipher hidden patterns in existing DNA holds immense promise for advancing medicine, biotechnology, and our fundamental understanding of the natural world. The open-source nature of the project ensures that its benefits will be widely shared, paving the way for a new era of discovery in the life sciences. Future research should focus on expanding Evo 2’s capabilities to incorporate other biological data types, such as protein structures and metabolic pathways, to create even more comprehensive and powerful models of living systems.
References:
- Arc Institute. (2024). Evo 2. GitHub. Retrieved from https://github.com/ArcInstitute/evo2
- ScienceAI. (2024, February 22). Evo 2 author and Stanford computational biologist Brian Hie: AI can discover patterns that humans cannot see.