HTP genomic engineering platform

US10968445B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10968445-B2
Application number: US-202017071691-A
Country: US
Kind code: B2
Filing date: Oct 15, 2020
Priority date: Dec 7, 2015
Publication date: Apr 6, 2021
Grant date: Apr 6, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure provides machine learning techniques for computationally predicting the phenotypic performance of combinations of genetic variations and for designing new improved host cells. The machine learning models and methods described herein are host agnostic and therefore can be implemented across taxa. Furthermore, the disclosed platform can be implemented to modulate or improve any host cell parameter of interest.
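To make the idea of predicting the performance of a combination of genetic variations concrete, here is a minimal sketch using ordinary least-squares linear regression, one of the model families the claims list. It is not the patent's implementation: the strain names, performance values, and helper functions (`encode`, `solve`, `fit`, `predict`) are all hypothetical, and the model is purely additive, so it ignores interactions between alterations.

```python
# Hypothetical training data: each strain is the set of alterations it carries,
# paired with a measured performance value (made-up numbers, arbitrary units).
TRAINING = [
    (frozenset(), 10.0),
    (frozenset({"snp_A"}), 12.0),
    (frozenset({"del_B"}), 11.0),
    (frozenset({"ins_C"}), 9.5),
    (frozenset({"snp_A", "del_B"}), 13.0),
]
ALTERATIONS = ["snp_A", "del_B", "ins_C"]


def encode(strain):
    """Intercept term followed by one 0/1 indicator per alteration."""
    return [1.0] + [1.0 if a in strain else 0.0 for a in ALTERATIONS]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit(training):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    X = [encode(s) for s, _ in training]
    y = [p for _, p in training]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


def predict(weights, strain):
    """Predicted performance of a strain carrying the given alterations."""
    return sum(w * x for w, x in zip(weights, encode(strain)))


weights = fit(TRAINING)
# A combination that does not appear in the training data.
untested = frozenset({"snp_A", "ins_C"})
prediction = predict(weights, untested)
```

Because the model sees only which alterations are present, not the organism, the same encoding could in principle be reused for any host, which is one way to read the abstract's "host agnostic" claim.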

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for engineering a host cell with a beneficial combination of genetic alterations, said method comprising: a) populating a predictive machine learning model with a training data set, containing: i) a plurality of genetic alteration input variables representing a plurality of genetic alterations that have been introduced into a host cell, and ii) a plurality of experimentally validated phenotypic performance output variables representing a plurality of phenotypic performance measurements associated with the plurality of introduced genetic alterations; b) generating, in silico, a pool of design candidate host cells incorporating at least two genetic alterations from the plurality of genetic alterations; c) utilizing the predictive machine learning model to predict expected phenotypic performance of a member of the pool of design candidate host cells that comprises a combination of genetic alterations selected from step (a), said combination being uncharacterized for improving phenotypic performance at the time of carrying out step (c); and d) manufacturing the member of the pool of design candidate host cells of step (c), thereby engineering a host cell with a beneficial combination of genetic alterations; wherein (a)-(d) are repeated until a manufactured member of the pool of design candidate host cells exhibits a desired level of improved phenotypic performance.

2. The method of claim 1, further comprising the steps of: e) culturing the manufactured member of the pool of design candidate host cells from step (d) in a in culture medium, thereby producing a host cell culture; and f) extracting a product of interest from the host cell culture.

3. The method of claim 2, wherein the product of interest is selected from the group consisting of: a small molecule, enzyme, protein, peptide, amino acid, organic acid, synthetic compound, fuel, alcohol, primary extracellular metabolite, secondary extracellular metabolite, intracellular component molecule, and combinations thereof.

4. The method of claim 1, wherein the predictive machine learning model incorporates at least one of the following: linear regression, kernel ridge regression, logistic regression, neural networks, support vector machines (SVMs), decision trees, hidden Markov models, Bayesian networks, a Gram-Schmidt process, reinforcement-based learning, cluster-based learning, hierarchical clustering, genetic algorithms, or combinations thereof.

5. The method of claim 1, wherein the predictive machine learning model incorporates epistatic effects.

6. The method of claim 1, wherein the predictive machine learning model is supervised, semi-supervised, or unsupervised.

7. The method of claim 1, wherein the plurality of genetic alterations comprise a genetic alteration selected from the group consisting of: a single nucleotide polymorphism, nucleotide sequence insertion, nucleotide sequence deletion, and nucleotide sequence replacements.

8. A computer-implemented method for designing a host cell to have a beneficial combination of genetic alterations, said method comprising the steps of: a) populating a machine learning model with a training data set, containing: i) a plurality of genetic alteration input variables representing a plurality of genetic alterations that have been introduced into a host cell, and ii) a plurality of experimentally-validated phenotypic performance output variables representing phenotypic performance measurements associated with the plurality of introduced genetic alterations; b) generating, in silico, a pool of genetic alteration designs that can be incorporated into candidate host cells, the genetic alteration designs comprising a combination of genetic alterations from the plurality of genetic alterations said combination of genetic alterations being uncharacterized for improving phenotypic performance at the time of generating the pool of genetic alteration designs; and c) utilizing the machine learning model to predict expected phenotypic performance of a candidate host cell that comprises a genetic alteration design from said pool; wherein the machine learning model incorporates at least one of the following: linear regression, kernel ridge regression, logistic regression, neural networks, support vector machines (SVMs), decision trees, hidden Markov models, Bayesian networks, a Gram-Schmidt process, reinforcement-based learning, cluster-based learning, hierarchical clustering, genetic algorithms, or combinations thereof.

9. The method of claim 8, comprising step d) perturbing the genome of a host cell to introduce the genetic alteration design that has a predicted expected phenotypic performance, as predicted in a previous step, thereby creating an engineered host cell.

10. The method of claim 9, comprising step e) measuring, in an in vitro assay, phenotypic performance of the engineered host cell from the previous step; and f) adding to the training data set of (a) i. one or more genetic alteration input variables representing one or more genetic alterations that were introduced into the engineered host cell that was measured in the previous step, and ii. one or more measured phenotypic performance output variables representing the phenotypic performance measurements of the engineered host cell that was measured in the previous step.

11. The method of claim 9, further comprising the steps of: e) culturing the engineered host cell from step (d) in a in culture medium, thereby producing a host cell culture; and f) extracting a product of interest from the host cell culture.

12. The method of claim 11, wherein the product of interest is selected from the group consisting of: a small molecule, enzyme, protein, peptide, amino acid, organic acid, synthetic compound, fuel, alcohol, primary extracellular metabolite, secondary extracellular metabolite, intracellular component molecule, and combinations thereof.

13. The method of claim 8, wherein the machine learning model incorporates epistatic effects.

14. The method of claim 8, wherein the plurality of genetic alterations comprise a genetic alteration selected from the group consisting of: a single nucleotide polymorphism, nucleotide sequence insertion, nucleotide sequence deletion, and nucleotide sequence replacements.

15. A computer-implemented method for designing a host cell to have a beneficial combination of genetic alterations, said method comprising the steps of: a) populating a machine learning model with a training data set, containing: i) a plurality of genetic alteration input variables representing a plurality of genetic alterations that have been introduced into a host cell, and ii) a plurality of experimentally-validated phenotypic performance output variables representing phenotypic performance measurements associated with the plurality of introduced genetic alterations; b) generating, in silico, a pool of genetic alteration designs that can be incorporated into candidate host cells, the genetic alteration designs comprising a combination of genetic alterations from the plurality of genetic alterations said combination of genetic alterations being uncharacterized for improving phenotypic performance at the time of generating the pool of genetic alteration designs; and c) utilizing the machine learning model to predict expected phenotypic performance of a candidate host cell that comprises a genetic alteration design from said pool; wherein the predicted expected phenotypic performance is production of a product of interest, said product of interest selected from the group consisting of: a small molecule,
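The design steps recited in the claims (generate a pool of uncharacterized combinations, predict their performance, then feed measurements back into the training data) can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed method itself: the function names, the per-alteration effect values, the additive stand-in predictor, and the assay reading are all hypothetical, and the build and measure steps happen in the lab rather than in code.

```python
from itertools import combinations


def candidate_pool(alterations, characterized, size=2):
    """All `size`-way combinations that have not yet been built and measured."""
    return [
        frozenset(combo)
        for combo in combinations(sorted(alterations), size)
        if frozenset(combo) not in characterized
    ]


def rank_candidates(pool, predict):
    """Sort candidates from highest to lowest predicted performance."""
    return sorted(pool, key=predict, reverse=True)


def record_measurement(training, design, measured):
    """Return a new training list that includes the measured design."""
    return training + [(design, measured)]


# Hypothetical per-alteration effects standing in for a trained model.
EFFECTS = {"snp_A": 2.0, "del_B": 1.0, "ins_C": -0.5}
BASELINE = 10.0


def additive_predict(design):
    """Stand-in predictor: baseline plus the sum of individual effects."""
    return BASELINE + sum(EFFECTS[a] for a in design)


# One combination has already been built and measured.
training = [(frozenset({"snp_A", "del_B"}), 13.0)]
characterized = {design for design, _ in training}

pool = candidate_pool(EFFECTS, characterized)
ranked = rank_candidates(pool, additive_predict)
best = ranked[0]

# After building and assaying `best`, its result joins the training data.
# The value 11.2 is a made-up assay reading used only for illustration.
training = record_measurement(training, best, measured=11.2)
```

In a real workflow the predictor would be refit on the enlarged training set before the next round, and the loop would stop once a built strain reaches the desired performance.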

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10968445B2 cover?
The present disclosure provides machine learning techniques for computationally predicting the phenotypic performance of combinations of genetic variations and for designing new improved host cells. The machine learning models and methods described herein are host agnostic and therefore can be implemented across taxa. Furthermore, the disclosed platform can be implemented to modulate or improve…
Who is the assignee on this patent?
Zymergen Inc
What technology area does this patent fall under?
Primary CPC classification C12N15/1058. Mapped technology areas include Chemistry & Metallurgy.
When was this patent published?
Publication date Apr 6, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).