Computational generation of chemical synthesis routes and methods

US11961595B2 · US · B2

Patent metadata
Field | Value
Publication number | US-11961595-B2
Application number | US-201916965222-A
Country | US
Kind code | B2
Filing date | Jan 30, 2019
Priority date | Jan 30, 2018
Publication date | Apr 16, 2024
Grant date | Apr 16, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Retrosynthetic methods are described for determining one or more optimal synthetic routes to generate a target compound.

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: determining, by a computing device, based on a first one or more sets of chemical reactions, a plurality of known chemical reactions; determining, by the computing device, based on a target compound and generalized known chemical transformations, a plurality of computationally generated chemical reactions that is different from the plurality of known chemical reactions; applying, by the computing device, a trained classifier to each of the plurality of computationally generated chemical reactions to classify one or more computationally generated chemical reactions from the plurality of computationally generated chemical reactions as successful computationally generated chemical reactions, the trained classifier comprising a machine learning model for execution by the computing device and trained using training data comprising one or more chemical reactions categorized as successful and one or more chemical reactions categorized as unsuccessful; generating, by the computing device, based on the one or more successful computationally generated chemical reactions and the plurality of known chemical reactions, a plurality of chemical reactions, wherein each chemical transition of the plurality of chemical reactions represents a transformation of a first compound to a second compound; determining, by the computing device, based on the target compound, a plurality of chemical synthesis routes, wherein each chemical synthesis route of the plurality of chemical synthesis routes comprises one or more chemical reactions of the plurality of chemical reactions and each chemical synthesis route of the plurality of chemical synthesis routes produces the target compound; identifying, by the computing device, a chemical synthesis route of the plurality of chemical synthesis routes having a corresponding cost less than a threshold; and performing chemical synthesis of the identified chemical synthesis route to synthesize the target compound.

2. The method of claim 1, further comprising training, by the computing device, a classifier on a training data set, wherein the training data set comprises one or more of, a chemical reaction database, estimated yields, or predicted yields for the one or more sets of chemical reactions.

3. The method of claim 2, wherein training the classifier on the training data set comprises: receiving a dataset comprising the plurality of known chemical reactions, wherein each of the plurality of known chemical reactions comprises at least one reactant, wherein each reactant of the at least one reactant is comprised of one or more atoms; for each reactant of the at least one reactant, classifying the one or more atoms into one or more categories based on a neighborhood atom, a bond order, a number of hydrogen atoms present, or a combination of one or more of the neighborhood atom, the bond order, or the number of hydrogen atoms present; for each reactant of the at least one reactant, determining a vector based on a histogram of the one or more categories; determining the training data set, wherein the training data set is comprised of a) vectors of reactions associated with a specific transformation and b) vectors of reactions associated with the specific transformation but yield a product from a different reaction type; exposing the classifier to a portion of the training data set to train the classifier; and exposing the trained classifier to another portion of the training data set to test the trained classifier.

4. The method of claim 3, wherein exposing the trained classifier to another portion of the training data set to test the trained classifier comprises assessing performance of the trained classifier based on one or more metrics.

5. The method of claim 4, wherein the one or more metrics comprise one or more of accuracy, positive precision, negative precision, positive recall, or negative recall.

6. The method of claim 1, further comprising generating, by the computing device, a tree data structure, wherein the target compound is a root node of the tree data structure.

7. The method of claim 6, further comprising adding, by the computing device, to the tree data structure, a plurality of branches, wherein each branch of the plurality of branches comprises a chemical synthesis route of the plurality of chemical synthesis routes.

8. The method of claim 1, wherein determining the plurality of chemical synthesis routes associated with the target compound is based on one or more parameters.

9. The method of claim 8, wherein the one or more parameters comprise one or more of available feedstock, available chemical substances, or available equipment.

10. The method of claim 1, wherein determining the plurality of chemical synthesis routes is based on one or more parameters.

11. The method of claim 10, wherein the one or more parameters comprise one or more of available feedstock, available chemical substances, available equipment, yield, financial cost, time, reaction conditions, or likelihood of reaction success.

12. The method of claim 1, wherein determining the plurality of chemical synthesis routes comprises: determining all compounds that can reach the target compound in at most a predefined number of steps, and wherein identifying the chemical synthesis route of the plurality of chemical synthesis routes having a corresponding cost less than a threshold comprises determining, from among the plurality of chemical synthesis routes including routes that exclude work-up or solvent exchange steps, a minimal cost chemical synthesis route to the target compound.

13. The method of claim 12, wherein determining the minimal cost chemical synthesis route comprises evaluating a cost function.

14. The method of claim 13, wherein the cost function comprises: Cost(C R) = ICost(R) + ( ∑ C ∈ Reactants(R) Cost(C R i) + ∑ f
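The route selection described in claims 1, 6, 12, and 13 can be illustrated with a small sketch. This is not the patented implementation: the `Reaction` class, the toy `FEEDSTOCK` prices, the `REACTIONS` table, and the `cheapest_route` function are all hypothetical. The cost used here keeps only the part of the claim 14 formula that is visible in the preview (an intrinsic reaction cost plus the summed cost of the reactants); the truncated remainder of that formula is deliberately not modeled. "Steps" is read as the depth of the retrosynthetic tree rooted at the target compound.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Reaction:
    """Hypothetical record for one reaction: reactants -> product."""
    product: str
    reactants: tuple       # names of the compounds consumed
    intrinsic_cost: float  # assumed stand-in for ICost(R)


# Toy data (assumption): purchasable feedstock prices and candidate reactions.
FEEDSTOCK = {"A": 1.0, "B": 2.0, "C": 5.0}
REACTIONS = [
    Reaction("T", ("X", "B"), 3.0),
    Reaction("T", ("C",), 10.0),
    Reaction("X", ("A",), 1.5),
]


def cheapest_route(compound, max_steps, reactions=REACTIONS, feedstock=FEEDSTOCK):
    """Return (cost, route) for the cheapest way to obtain `compound`.

    The search walks backwards from the compound (the root of the tree) and
    explores at most `max_steps` levels of reactions. `route` lists reactions
    in execution order; an unreachable compound yields (inf, []).
    """
    # Buying the compound directly is a zero-reaction route.
    best_cost = feedstock.get(compound, inf)
    best_route = []
    if max_steps == 0:
        return best_cost, best_route

    for rxn in reactions:
        if rxn.product != compound:
            continue
        total = rxn.intrinsic_cost
        route = []
        for reactant in rxn.reactants:
            sub_cost, sub_route = cheapest_route(
                reactant, max_steps - 1, reactions, feedstock
            )
            total += sub_cost
            route.extend(sub_route)
        if total < best_cost:
            best_cost, best_route = total, route + [rxn]
    return best_cost, best_route


# Usage: accept the route only if its cost is below a threshold, as in claim 1.
THRESHOLD = 10.0  # hypothetical value
cost, route = cheapest_route("T", max_steps=2)
selected = route if cost < THRESHOLD else None
```

The depth bound also guarantees termination when the reaction table contains cycles. In this toy table, a bound of one level rules out the two-step route through `X`, so only the direct but more expensive route from `C` remains.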

Assignees

  • Stanford Res Inst Int

Inventors

Classifications

  • G16C20/70 · Primary

    Machine learning, data mining or chemometrics · CPC title

  • Computational theoretical chemistry, i.e. ICT specially adapted for theoretical aspects of quantum chemistry, molecular mechanics, molecular dynamics or the like · CPC title

  • Analysis or design of chemical reactions, syntheses or processes · CPC title

  • Data visualisation · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11961595B2 cover?
Retrosynthetic methods are described for determining one or more optimal synthetic routes to generate a target compound.
Who is the assignee on this patent?
Stanford Res Inst Int
What technology area does this patent fall under?
Primary CPC classification G16C20/70. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 16, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).