Generative AI for DNA

Category: Biotechnology and Genomics

Tags: Genomic Models, Mutation Analysis, Open Source, Protein Design, Synthetic Biology

Entities: BRCA1, ChatGPT, DNA, Evo2, synthetic biology

Summary

    Introduction to Evo2
    • Evo2 acts like a generative model for the genome, proposing new variations of biological sequences.
    • It can be used for designing proteins that act as therapeutics or degrade plastics.
    Functionality and Applications
    • Evo2 resembles a language model but works on DNA instead of human language.
    • The model can identify which mutations in a gene are likely to be linked to disease and which are more neutral; the example given is BRCA1, a gene often associated with breast cancer.
    • Evo2 can generate realistic DNA outputs from given DNA inputs, aiding synthetic biology applications.
    • Researchers can use Evo2 to design new protein sequences and test them in labs for improved functionality.
    Open Source and Community Impact
    • All Evo2 models, data, and infrastructure will be open source.
    • This openness allows for community-driven development and application building.
    • Open source models are considered safer as the community can evaluate their strengths and weaknesses.
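The mutation-analysis idea in the summary above (telling disease-linked mutations apart from neutral ones) is commonly framed as comparing how likely a sequence model finds the reference sequence versus the mutated one. The sketch below illustrates that framing only. It is not Evo2 and does not use any Evo2 API: a tiny k-mer Markov model stands in for the DNA language model, and the names `train_markov`, `log_likelihood`, and `mutation_score` are hypothetical helpers invented for this example.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_markov(sequences, k=3):
    """Count how often each base follows each k-base context (toy stand-in for a DNA model)."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i in range(k, len(seq)):
            counts[seq[i - k:i]][seq[i]] += 1
    return counts


def log_likelihood(seq, counts, k=3):
    """Sum of log P(base | previous k bases), using add-one smoothing."""
    total = 0.0
    for i in range(k, len(seq)):
        ctx = counts.get(seq[i - k:i], {})
        n = sum(ctx.values())
        total += math.log((ctx.get(seq[i], 0) + 1) / (n + len(ALPHABET)))
    return total


def mutation_score(ref, pos, alt, counts, k=3):
    """Log-likelihood of the mutated sequence minus that of the reference.

    A strongly negative score means the toy model finds the mutation surprising.
    """
    mutated = ref[:pos] + alt + ref[pos + 1:]
    return log_likelihood(mutated, counts, k) - log_likelihood(ref, counts, k)


# Assumed toy data: a repetitive "genome" and a single-base substitution.
counts = train_markov(["ATGGCC" * 10])
reference = "ATGGCCATGGCC"
print(mutation_score(reference, 6, "T", counts))
```

With a real DNA language model, the likelihoods would come from the model itself rather than from k-mer counts; only the reference-versus-mutant comparison carries over from this sketch.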

    Transcript

    00:00

    [Music] Evo2 can kind of act like a ChatGPT or a generative model for the genome and propose new variations of biological sequences that can improve some downstream function. So this function could be designing a protein that acts

    00:16

    as a therapeutic, or designing a protein that degrades plastic or that cleans up oil spills. At its most fundamental, Evo does resemble a language model. The first big

    00:33

    difference is the data. A language model typically works on sentences or chunks of language; Evo works on DNA: a DNA language model trained on DNA from microbes, from mammals, from plants, from all sorts of organisms that have been

    00:49

    sequenced. Biology is very complicated, and it's written in a language that humans can't understand: these A's, C's, G's and T's. It's a foreign language to us. What we're trying to do with Evo2 is actually make biological design much easier for the average researcher. In

    01:07

    DNA, often you have to have hundreds or thousands of base pairs in order to encode a single gene, just one gene. But any one of those base pairs, if it changes, may be functionally important. The model can understand which

    01:22

    mutations lead to certain diseases and which ones might be more neutral. So, for example, we show that the model can identify certain mutations in BRCA1, which is a gene that's often associated with breast cancer. Another practical application is in using Evo to do design

    01:40

    tasks. In the same way that ChatGPT can generate realistic human language responses to a human language input, if you give Evo a DNA input it will try to generate realistic DNA outputs. This is very exciting because it opens options

    01:55

    in various synthetic biology applications. Say you want to design a new protein sequence: you can actually prompt the model with a piece of DNA and then just generate new versions of this protein by using Evo2 to autocomplete the genome. And when it generates these

    02:13

    variations of this protein, you can test all of them in the lab and potentially identify new proteins that have a better function of interest. And proteins are very useful because they're these molecular machines that accomplish many important biological

    02:28

    functions. All Evo models will be open source. The data will be open source, the training infrastructure will be open source, the inference infrastructure will be open source. This is a change compared to a lot of model releases across the board. It's all open for other people to

    02:44

    build on, apply to their use cases, and hopefully to enable community-driven development. And we also think that models are safest when they're open, so that the community can actually evaluate them and see their strengths and their

    02:59

    weaknesses. We envision this model really being a foundation for other models in genetics or in synthetic biology, where people can build applications on top of Evo2, and the entire community can benefit.
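The "autocomplete the genome" workflow described in the transcript (prompt with a piece of DNA, generate several variants, then test them in the lab) can be illustrated with a toy generator. This is a minimal sketch under stated assumptions, not Evo2 and not its API: a k-mer Markov model trained on made-up sequences stands in for the DNA language model, and `train_markov` and `autocomplete` are hypothetical names chosen for this example.

```python
import random
from collections import defaultdict

BASES = list("ACGT")


def train_markov(sequences, k=3):
    """Count how often each base follows each k-base context (toy stand-in for a DNA model)."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i in range(k, len(seq)):
            counts[seq[i - k:i]][seq[i]] += 1
    return counts


def autocomplete(prompt, counts, n_new, k=3, seed=0):
    """Extend a DNA prompt by n_new bases, sampling each base from the toy model."""
    if len(prompt) < k:
        raise ValueError("prompt must contain at least k bases")
    rng = random.Random(seed)
    seq = prompt
    for _ in range(n_new):
        ctx = counts.get(seq[-k:], {})
        # Add-one smoothing keeps every base possible, so variants can differ.
        weights = [ctx.get(b, 0) + 1 for b in BASES]
        seq += rng.choices(BASES, weights=weights, k=1)[0]
    return seq


# Assumed toy training data; a real model would be trained on sequenced genomes.
counts = train_markov(["ATGGCC" * 10])

# Generate a handful of candidate continuations from the same prompt,
# mirroring the "generate variants, then test them in the lab" step.
candidates = [autocomplete("ATG", counts, n_new=9, seed=s) for s in range(5)]
for candidate in candidates:
    print(candidate)
```

Changing the seed yields different candidates from the same prompt; in the workflow the speaker describes, each candidate would then be evaluated experimentally rather than trusted as generated.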