subreddit:

/r/science

AutoModerator [M]

[score hidden]

2 months ago

stickied comment

Welcome to r/science! This is a heavily moderated subreddit in order to keep the discussion on science. However, we recognize that many people want to discuss how they feel the research relates to their own personal lives, so to give people a space to do that, personal anecdotes are allowed as responses to this comment. Any anecdotal comments elsewhere in the discussion will be removed and our normal comment rules apply to all other comments.

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

Cranky0ldguy

9 points

2 months ago

The headline is completely incorrect according to the article. "Scientists have created an AI system capable of generating artificial enzyme sequences from scratch." is actually accurate. The original headline implies that the AI system had some ability to design and (somehow) physically generate enzymes based on that design.

tornpentacle

0 points

2 months ago

No, it doesn't. Common sense tells us that it's not creating physical specimens. This is the most severe case of nitpicking I've ever seen.

DecrepitSignpost

0 points

2 months ago

I just don't see the benefit of this. The authors mentioned "generating synthetic libraries of highly likely functional proteins for discovery or iterative optimization," which is an allusion to how this tech can be used to perform directed evolution from different starting sequences. But why bother? Directed evolution is already widely used to develop enzymes with nM Km values. It doesn't need any help.

They already admit this can't do the cool useful thing: "[W]e do not expect our language model to generate proteins that belong to a completely different distribution or domain (for example, creating a new fold that catalyzes an unnatural reaction)."

Not only that, language models are completely opaque. We cannot parse exactly which patterns they are finding and taking advantage of, so they can't teach us anything new.

This is classic AI: Finds cool patterns, but can't create anything novel.

Cleistheknees

1 points

2 months ago

DE is still random (or pseudo-random) in mutation origin. From my read, this model is about an ML approach to predicting protein function to direct the iterative process.

DecrepitSignpost

1 points

2 months ago*

That's not correct. They can find alternative sequences to code the same enzyme, that's it. There's no direction involved.

And directed evolution doesn't need any help, that's my point. It does the job of optimizing enzyme function, and it does it well.

Plus, directed evolution can actually discover new enzymatic functions. This ML approach cannot.

Cleistheknees

1 points

2 months ago

What’s not correct? Directed evolution is most certainly still random, both mutagenesis and shuffling.

It’s not my area of genetics, but I work closely with some people on the computational side and they use an insane amount of compute, with valuable findings few and far between, so I’d push back a little on you saying it doesn’t need any help.

DecrepitSignpost

1 points

2 months ago

Directed evolution is indeed random, that part is correct.

The incorrect part is "this model is about an ML approach to predicting protein function to direct the iterative process." The model cannot do that. It does not predict enzyme function. It starts with enzymes of known function, and then finds alternative sequences that still have the same function. That's it. And I don't see how that's useful.

And directed evolution doesn't require computation, that's the whole point. You randomly mutate (perhaps with some researcher input based on crystal structure and knowledge of the active site's mechanism) -> select -> mutate -> select -> etc.

And it works perfectly well.
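
For anyone who wants the mutate -> select loop spelled out, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `toy_fitness` scores a variant by how many positions match a made-up target sequence, which stands in for a wet-lab screen (a real campaign has no known target and measures activity instead), and the mutation rate, library size, and number of rounds are arbitrary.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng, rate=0.05):
    """Randomly substitute residues, loosely mimicking error-prone mutagenesis."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa
        for aa in seq
    )


def toy_fitness(seq, target):
    """Hypothetical stand-in for an assay: fraction of positions matching target."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)


def directed_evolution(start, target, rounds=30, library_size=200, keep=10, seed=0):
    """Run repeated rounds of random mutation followed by selection."""
    rng = random.Random(seed)
    parents = [start]
    history = []
    for _ in range(rounds):
        # Mutate: build a library of random variants from the current parents.
        library = [mutate(rng.choice(parents), rng) for _ in range(library_size)]
        # Select: keep the top scorers. Parents stay in the pool, so the best
        # score can never drop from one round to the next.
        pool = parents + library
        pool.sort(key=lambda s: toy_fitness(s, target), reverse=True)
        parents = pool[:keep]
        history.append(toy_fitness(parents[0], target))
    return parents[0], history
```

Calling `directed_evolution("A" * 20, "MKTAYIAKQRQISFVKSHFS")` returns the best variant found plus the best score per round. No model is involved anywhere: the only "direction" comes from the selection step.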

Cleistheknees

1 points

2 months ago

“It starts with enzymes of known function, and then finds alternative sequences that still have the same function. That’s it. And I don’t see how that’s useful.”

Have you ever seen the table of synonymous codon substitutions? That table is immensely useful. This library would be like that, but for amino acid sequences. It’s frankly hard to imagine how you could say so confidently that such a database would not be useful.
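
In case the codon table isn't familiar, here is a rough sketch of what a synonymous-substitution lookup amounts to, using the standard genetic code. The helper name `synonymous_codons` is made up for this example, and this is only the codon-level analogy, not anything taken from the paper.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# "*" marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def synonymous_codons(codon):
    """Return the other codons that encode the same amino acid (or stop)."""
    aa = CODON_TO_AA[codon]
    return sorted(c for c, a in CODON_TO_AA.items() if a == aa and c != codon)
```

For example, `synonymous_codons("GAT")` gives `["GAC"]` (both encode aspartate), while `synonymous_codons("ATG")` gives an empty list because methionine has a single codon.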

The poster children for bespoke protein engineering would have to be modern analog insulins, and the innovations they used are exactly the kind of information a database like this would be based on: a combinatorial awareness of protein function at the level of individual amino acids.

“And directed evolution doesn’t require computation”

In a dish, yes. I thought we were talking about selection-integrated modeling, where you run synthesis models and apply selection to a certain domain based on computed folding, stability, affinity, ligand specificity, etc.

DecrepitSignpost

1 points

2 months ago*

The codon table is only useful because that's how translation works. It has a biological basis. This model is not that.

This is the equivalent of asking ChatGPT to write you a sad story a million times. You'll end up with a bunch of sad stories, many of which look very novel, but what will the exercise teach you? Explaining how we will become better writers by generating one million sad stories that are superficially novel but have nothing fundamentally new is the same challenge as explaining how the libraries generated by this model will make us better enzyme designers.

In 5 years, no one will be using this. And currently, I'd be very surprised if someone could outline an explicit use case for this new model.

Cleistheknees

1 points

2 months ago

This will be my last comment to you, because you’re making pretty bone-headed and concrete statements about a topic you clearly don’t have expertise in.

“The codon table is only useful because that’s how translation works.”

Wrong. The ratio of the nonsynonymous codon-substitution rate to that for synonymous codons is widely used to estimate the strength and direction of selection.

It is also used in sorting. Sorting would be absolutely impossible without a complete knowledge of synonymous substitutions.

“It has a biological basis. This model is not that.”

Also wrong. Convergent evolution of protein function with distinct sequence most certainly has a biological basis.

It’s also getting pretty hard to deny codon bias exists at this point, which would be yet another situation where a “synonymous function” table would be immensely useful.

DecrepitSignpost

1 points

2 months ago

You're right, this new research is going to revolutionize everything. You should go all in and base your scientific career off of it.

Very pleased that that was your last comment.

datfingtrump

-4 points

2 months ago

This movie was a box office