Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1016/j.cels.2024.01.002
- Scopus: eid_2-s2.0-85185824181
- PMID: 38340729
Article: Accurate top protein variant discovery via low-N pick-and-validate machine learning
Title | Accurate top protein variant discovery via low-N pick-and-validate machine learning |
---|---|
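Authors | Chu, Hoi Yee; Fong, John HC; Thean, Dawn GL; Zhou, Peng; Fung, Frederic KC; Huang, Yuanhua; Wong, Alan SL |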
Keywords | active learning; base editor; Cas9; combinatorial mutagenesis; CRISPR; genome editing; low-N; machine learning; protein engineering; zero-shot |
Issue Date | 21-Feb-2024 |
Publisher | Elsevier |
Citation | Cell Systems, 2024, v. 15, n. 2, p. 193-203 |
Abstract | A strategy to obtain the greatest number of best-performing variants with least amount of experimental effort over the vast combinatorial mutational landscape would have enormous utility in boosting resource producibility for protein engineering. Toward this goal, we present a simple and effective machine learning-based strategy that outperforms other state-of-the-art methods. Our strategy integrates zero-shot prediction and multi-round sampling to direct active learning via experimenting with only a few predicted top variants. We find that four rounds of low-N pick-and-validate sampling of 12 variants for machine learning yielded the best accuracy of up to 92.6% in selecting the true top 1% variants in combinatorial mutant libraries, whereas two rounds of 24 variants can also be used. We demonstrate our strategy in successfully discovering high-performance protein variants from diverse families including the CRISPR-based genome editors, supporting its generalizable application for solving protein engineering tasks. A record of this paper's transparent peer review process is included in the supplemental information. |
Persistent Identifier | http://hdl.handle.net/10722/345914 |
ISSN | 2405-4712 (2023 Impact Factor: 9.0; 2023 SCImago Journal Rankings: 4.872) |
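
The abstract describes a loop in which zero-shot scores choose the first small batch of variants, the measured results train a model, and the model then picks the next batch. Both reported schedules use the same budget: 4 rounds of 12 and 2 rounds of 24 each come to 48 tested variants. The following is a minimal Python sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`one_hot`, `ridge_predict`, `pick_and_validate`), the one-hot encoding, the ridge regression model, the synthetic additive fitness landscape, and the simulated zero-shot scores (true fitness plus noise).

```python
import itertools
import numpy as np


def one_hot(variants, n_states):
    """Encode each variant (a tuple of per-site state indices) as a flat one-hot vector."""
    n_sites = len(variants[0])
    X = np.zeros((len(variants), n_sites * n_states))
    for row, variant in enumerate(variants):
        for site, state in enumerate(variant):
            X[row, site * n_states + state] = 1.0
    return X


def ridge_predict(X_train, y_train, X_all, alpha=1.0):
    """Closed-form ridge regression (an assumed stand-in model, not the paper's)."""
    gram = X_train.T @ X_train + alpha * np.eye(X_train.shape[1])
    weights = np.linalg.solve(gram, X_train.T @ y_train)
    return X_all @ weights


def pick_and_validate(X, zero_shot_scores, measure, rounds=4, batch=12):
    """Greedy multi-round sampling: test the top-ranked untested variants, then refit."""
    scores = np.asarray(zero_shot_scores, dtype=float)
    measured = {}  # variant index -> measured fitness
    for _ in range(rounds):
        ranked = np.argsort(-scores)  # best predicted first
        picks = [int(i) for i in ranked if int(i) not in measured][:batch]
        for i in picks:
            measured[i] = measure(i)  # the "validate" step (an experiment in practice)
        idx = list(measured)
        y = np.array([measured[i] for i in idx])
        scores = ridge_predict(X[idx], y, X)  # re-score the whole library
    return measured, scores


# Synthetic demo: 4 sites with 5 states each gives 5**4 = 625 variants.
rng = np.random.default_rng(0)
variants = list(itertools.product(range(5), repeat=4))
X = one_hot(variants, n_states=5)
truth = X @ rng.normal(size=X.shape[1])                     # hypothetical additive landscape
zero_shot = truth + rng.normal(scale=2.0, size=truth.size)  # simulated noisy prior

measured, scores = pick_and_validate(X, zero_shot, lambda i: truth[i])
true_top = set(np.argsort(-truth)[:6].tolist())  # roughly the top 1% of 625
pred_top = set(np.argsort(-scores)[:6].tolist())
print(f"tested {len(measured)} variants; recovered {len(true_top & pred_top)} of 6 top variants")
```

Calling `pick_and_validate(X, zero_shot, measure, rounds=2, batch=24)` gives the alternative schedule with the same total of 48 measurements. The recovery rate printed here depends entirely on the synthetic landscape and says nothing about the accuracy reported in the paper.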
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chu, Hoi Yee | - |
dc.contributor.author | Fong, John HC | - |
dc.contributor.author | Thean, Dawn GL | - |
dc.contributor.author | Zhou, Peng | - |
dc.contributor.author | Fung, Frederic KC | - |
dc.contributor.author | Huang, Yuanhua | - |
dc.contributor.author | Wong, Alan SL | - |
dc.date.accessioned | 2024-09-04T07:06:26Z | - |
dc.date.available | 2024-09-04T07:06:26Z | - |
dc.date.issued | 2024-02-21 | - |
dc.identifier.citation | Cell Systems, 2024, v. 15, n. 2, p. 193-203 | - |
dc.identifier.issn | 2405-4712 | - |
dc.identifier.uri | http://hdl.handle.net/10722/345914 | - |
dc.description.abstract | A strategy to obtain the greatest number of best-performing variants with least amount of experimental effort over the vast combinatorial mutational landscape would have enormous utility in boosting resource producibility for protein engineering. Toward this goal, we present a simple and effective machine learning-based strategy that outperforms other state-of-the-art methods. Our strategy integrates zero-shot prediction and multi-round sampling to direct active learning via experimenting with only a few predicted top variants. We find that four rounds of low-N pick-and-validate sampling of 12 variants for machine learning yielded the best accuracy of up to 92.6% in selecting the true top 1% variants in combinatorial mutant libraries, whereas two rounds of 24 variants can also be used. We demonstrate our strategy in successfully discovering high-performance protein variants from diverse families including the CRISPR-based genome editors, supporting its generalizable application for solving protein engineering tasks. A record of this paper's transparent peer review process is included in the supplemental information. | - |
dc.language | eng | - |
dc.publisher | Elsevier | - |
dc.relation.ispartof | Cell Systems | - |
dc.subject | active learning | - |
dc.subject | base editor | - |
dc.subject | Cas9 | - |
dc.subject | combinatorial mutagenesis | - |
dc.subject | CRISPR | - |
dc.subject | genome editing | - |
dc.subject | low-N | - |
dc.subject | machine learning | - |
dc.subject | protein engineering | - |
dc.subject | zero-shot | - |
dc.title | Accurate top protein variant discovery via low-N pick-and-validate machine learning | - |
dc.type | Article | - |
dc.identifier.doi | 10.1016/j.cels.2024.01.002 | - |
dc.identifier.pmid | 38340729 | - |
dc.identifier.scopus | eid_2-s2.0-85185824181 | - |
dc.identifier.volume | 15 | - |
dc.identifier.issue | 2 | - |
dc.identifier.spage | 193 | - |
dc.identifier.epage | 203 | - |
dc.identifier.eissn | 2405-4720 | - |
dc.identifier.issnl | 2405-4712 | - |