In recent years, artificial intelligence has reshaped biology, particularly the study of how protein structure relates to function. From predicting protein shapes to designing enzymes that digest plastics, AI has already proven its worth. A new result takes this one step further by moving beyond proteins to the DNA level, where evolution itself operates.
From Language Models to Genomic Models
A team at Stanford University has developed a system called Evo, a “genomic language model” trained on vast collections of bacterial genomes. Much like large language models that predict the next word in a sentence, Evo predicts the next base in a DNA sequence. Because bacterial genomes often cluster related genes together, Evo can learn functional patterns across entire biochemical pathways.
This approach allows Evo not only to reconstruct missing genes but also to generate novel DNA sequences that encode proteins with functions never seen before. In tests, Evo successfully restored deleted genes in bacterial clusters and produced variations that respected evolutionary constraints.
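To make the idea of next-base prediction concrete, here is a toy sketch. It is not how Evo works internally: Evo is a large neural network, whereas this example swaps in a simple k-mer frequency table to show the same autoregressive loop (look at the preceding bases, predict the next one, append it, repeat). The function names, the context length `k`, and the tiny "genomes" are all hypothetical, chosen purely for illustration.

```python
from collections import Counter, defaultdict


def train_next_base_counts(sequences, k=3):
    """Count which base follows each k-base context in the training sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - k):
            context = seq[i:i + k]
            counts[context][seq[i + k]] += 1
    return counts


def predict_next_base(counts, context, k=3):
    """Return the most frequent base seen after the last k bases, or None if unseen."""
    options = counts.get(context[-k:])
    if not options:
        return None
    return options.most_common(1)[0][0]


def extend(counts, prompt, n_bases, k=3):
    """Autoregressively append up to n_bases predicted bases to the prompt."""
    seq = prompt
    for _ in range(n_bases):
        next_base = predict_next_base(counts, seq, k)
        if next_base is None:  # context never seen in training; stop early
            break
        seq += next_base
    return seq


# Hypothetical toy "genomes"; real training data would be far larger.
toy_genomes = ["ATGGCGATGGCGATGGCG", "ATGGCGTTGGCG"]
counts = train_next_base_counts(toy_genomes)
completed = extend(counts, "ATG", 3)
```

Prompting this toy model with `"ATG"` and asking for three more bases plays the same role, in miniature, as prompting a genomic model with part of a gene cluster and letting it fill in the rest. A real model conditions on far longer contexts and samples from a learned probability distribution rather than always taking the most frequent base.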
Novel Proteins and Antitoxins
The real excitement came when researchers challenged Evo to invent something new. Using bacterial toxins as prompts, Evo generated completely new antitoxins. Out of ten tested outputs, half showed partial activity, and two fully neutralized toxicity, despite having only weak similarity (about 25%) to known antitoxins. These proteins weren’t simple recombinations of existing sequences; they appeared to be assembled from fragments of dozens of proteins, creating entirely new molecular architectures.
Evo also demonstrated success with RNA-based inhibitors and even CRISPR inhibitors, producing proteins that confused structure-prediction software because they had no resemblance to anything known.
The Scale of Discovery
When prompted with 1.7 million bacterial and viral genes, Evo generated 120 billion base pairs of synthetic DNA. Some sequences matched known genes, but many appear to encode truly novel proteins. While it remains uncertain how these vast datasets will be applied, the potential for biotechnology, medicine, and synthetic biology is enormous.
Limits and Future Directions
Evo’s success relies on bacterial genomes, which are relatively simple and well organized. More complex organisms, such as humans, have scattered genes and intricate regulatory systems that may challenge this approach. Still, the concept of linking nucleic acid-level patterns to functional proteins opens a new frontier in biology.
Conclusion
This breakthrough highlights how generative AI can accelerate evolution’s creative process, producing proteins that nature might never have discovered on its own. While practical applications are still emerging, Evo demonstrates that AI can operate at the very foundation of life: DNA itself.
Source: John Timmer, Ars Technica, “Generative AI meets the genome” (Nov 21, 2025).