Adaptive sampling of training data for machine learning models based on PAC-bayes analysis of risk bounds
US-11200511-B1 · Dec 14, 2021 · US
US2021018902A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2021018902-A1 |
| Application number | US-202017063599-A |
| Country | US |
| Kind code | A1 |
| Filing date | Oct 5, 2020 |
| Priority date | Mar 13, 2018 |
| Publication date | Jan 21, 2021 |
| Grant date | — |
Official abstract text for this publication.
Operating a substrate processing system includes receiving a plurality of sets of training data, storing a plurality of machine learning models, storing a plurality of physical process models, receiving a selection of a machine learning model from the plurality of machine learning models and a selection of a physical process model from the plurality of physical process models, generating an implemented machine learning model according to the selected machine learning model, calculating a characterizing value for each training spectrum in each set of training data thereby generating a plurality of training characterizing values with each training characterizing value associated with one of the plurality of training spectra, training the implemented machine learning model using the plurality of training characterizing values and plurality of training spectra to generate a trained machine learning model, and passing the trained machine learning model to a control system of the substrate processing system.
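The labeling step in the abstract, where a selected physical process model turns timestamps and known endpoint values into one training characterizing value per measurement, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and does not come from the publication: the registry `PROCESS_MODELS`, the functions `linear_model`, `power_law_model`, and `label_training_set`, and the power-law form of the non-linear model. The sketch also assumes that both a starting and an ending characterizing value are available and that the first and last timestamps differ.

```python
from typing import Callable, Dict, List

# Hypothetical signature for a physical process model: it maps a timestamp,
# the run's start/end times, the start/end characterizing values, and one
# physical parameter value to a characterizing value (e.g. layer thickness).
ProcessModel = Callable[[float, float, float, float, float, float], float]


def linear_model(t: float, t0: float, t1: float,
                 v0: float, v1: float, param: float) -> float:
    """Linear function of time: the value moves from v0 to v1 at a constant rate."""
    frac = (t - t0) / (t1 - t0)
    return v0 + frac * (v1 - v0)


def power_law_model(t: float, t0: float, t1: float,
                    v0: float, v1: float, param: float) -> float:
    """Hypothetical non-linear function of time: progress follows frac ** param.

    With param == 1.0 this reduces to the linear model.
    """
    frac = (t - t0) / (t1 - t0)
    return v0 + (frac ** param) * (v1 - v0)


# Stored plurality of physical process models, keyed by a selection name.
PROCESS_MODELS: Dict[str, ProcessModel] = {
    "linear": linear_model,
    "power_law": power_law_model,
}


def label_training_set(timestamps: List[float], start_value: float,
                       end_value: float, model_name: str,
                       param: float = 1.0) -> List[float]:
    """Compute one training characterizing value per timestamp."""
    model = PROCESS_MODELS[model_name]
    t0, t1 = timestamps[0], timestamps[-1]
    return [model(t, t0, t1, start_value, end_value, param) for t in timestamps]
```

The returned list pairs one-to-one with the raw training values recorded at the same timestamps, so the two together form the input and target data for training the implemented machine learning model.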
Opening claim text (preview).
What is claimed is:

1. A method of operating a substrate processing system, comprising: receiving a plurality of sets of training data, each set of training data including a plurality of raw training values, a timestamp for each raw training value from the plurality of raw training values, and a starting characterizing value and/or an ending characterizing value for the plurality of raw training values; storing a plurality of machine learning models, each machine learning model providing at least one different hyperparameter; storing a plurality of physical process models, each physical process model providing a different function to generate characterizing values as a different function of time and/or a different physical process parameter; receiving a selection of a machine learning model from the plurality of machine learning models and a selection of a physical process model from the plurality of physical process models to provide a combination of a selected machine learning model and a selected physical process model; receiving at least one hyperparameter value for the selected machine learning model and at least one physical parameter value for the selected physical process model; generating an implemented machine learning model according to the selected machine learning model and the at least one hyperparameter value; for each of a plurality of groups of one or more raw training values from the plurality of raw training values, calculating a characterizing values based on the one or more raw training values, one or more timestamps for one or more raw training values, the starting characterizing value and/or ending characterizing value for the set of training data, the physical parameter value, and the selected physical process model, thereby generating the plurality of training characterizing values with each training characterizing value associated with a group of one or more raw training values from the plurality of raw training values; training the implemented machine learning model using the plurality of training characterizing values and plurality of raw training values to generate a trained machine learning model; and passing the trained machine learning model to a processing control system of the substrate processing system.

2. The method of claim 1, wherein the plurality of raw training values comprise measurements from an eddy current monitoring system, motor current or torque monitoring system, or optical monitoring system.

3. The method of claim 1, wherein the substrate processing system comprises a chemical mechanical polishing system.

4. The method of claim 2, further comprising: polishing a substrate in the polishing system; during polishing of the substrate, monitoring the substrate with an in-situ monitoring system to generate the plurality of raw training values; passing the plurality of raw training values to the trained machine learning model to generate a plurality of characterizing values; and controlling at least one processing parameter of the polishing system based on the plurality of characterizing values.

5. The method of claim 4, wherein controlling the at least one processing parameter includes halting polishing and/or adjusting carrier head pressure.

6. A computer program product for controlling processing of a substrate, the computer program product tangibly embodied in a non-transitory computer readable media and comprising instructions for causing a processor to: receive a plurality of sets of training data, each set of training data including a plurality of raw training values, a timestamp for each raw training value from the plurality of raw training values, and a starting characterizing value and/or an ending characterizing value for the plurality of raw training values; store a plurality of machine learning models, each machine learning model providing at least one different hyperparameter; store a plurality of physical process models, each physical process model providing a different function to generate characterizing values as a different function of time and/or a different physical process parameter; receive a selection of a machine learning model from the plurality of machine learning models and a selection of a physical process model from the plurality of physical process models to provide a combination of a selected machine learning model and a selected physical process model; receive at least one hyperparameter value for the selected machine learning model and at least one physical parameter value for the selected physical process model; generate an implemented machine learning model according to the selected machine learning model and the at least one hyperparameter value; for each of a plurality of groups of one or more raw training values from the plurality of raw training values, calculate a characterizing values based on the one or more raw training values, one or more timestamps for one or more raw training values, the starting characterizing value and/or ending characterizing value for the set of training data, the physical parameter value, and the selected physical process model, to thereby generate the plurality of training characterizing values with each training characterizing value associated with a group of one or more raw training values from the plurality of raw training values; train the implemented machine learning model using the plurality of training characterizing values and plurality of raw training values to generate a trained machine learning model; and pass the trained machine learning model to a processing control system of the substrate processing system.

7. The computer program product of claim 6, wherein the characterizing value comprises a thickness value for a layer on the substrate.

8. The computer program product of claim 6, wherein the plurality of machine learning models include a convolutional neural network and a fully connected neural network.

9. The computer program product of claim 8, wherein at least one different hyperparameter comprises a number of hidden layers in the neural network.

10. The computer program product of claim 6, wherein some of the plurality of physical process models include a linear function of time and some of the plurality of physical process models include a non-linear function of time.

11. The computer program product of claim 6, wherein the plurality of physical process models include different physical process parameters.

12. The computer program product of claim 6, wherein the physical process parameter includes one or more of pattern density, starting step height, critical step height, and process selectivity.

13. The computer program product of claim 6, comprising instructions to receive at least one hyperparameter value for the selected machine learning model, and wherein the instructions to generate the implemented machine learning model include instructions to generate the implemented machine learning model according to the selected machine learning model and the at least one hyperparameter value.

14. The computer program product of claim 6, comprising instructions to receive a physical parameter value for the selected physical process model, and wherein the instructions to calculate the characterizing value include instructions to calculate the characterizing value based on the physical parameter value.

15. A semiconductor fabrication system, comprising: a plurality of polishing systems, each polishing system including a support to hold a polishing pad, a carrier to hold a substrate against the polishing pad, a motor to ca
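The run-time use described in claims 4 and 5, where in-situ measurements are passed through the trained model and polishing is halted based on the resulting characterizing values, might look roughly like the following. This is a sketch under stated assumptions, not the claimed control system: `run_endpoint_control` is a hypothetical name, `predict_thickness` is any callable standing in for the trained machine learning model, and the halting rule (stop once the predicted thickness is at or below a target) is one simple choice among many. Adjusting carrier head pressure is not modeled.

```python
from typing import Callable, Iterable, Optional, Sequence


def run_endpoint_control(
    measurements: Iterable[Sequence[float]],
    predict_thickness: Callable[[Sequence[float]], float],
    target_thickness: float,
) -> Optional[int]:
    """Return the index of the first measurement at which polishing should halt.

    Each element of `measurements` is one group of raw in-situ values.
    Returns None if the predicted thickness never reaches the target.
    """
    for index, raw_values in enumerate(measurements):
        # Convert the raw values into a characterizing value (thickness here).
        thickness = predict_thickness(raw_values)
        if thickness <= target_thickness:
            return index  # Hypothetical halting rule: target reached.
    return None


def mean_scaled_model(raw_values: Sequence[float]) -> float:
    """Stand-in for a trained model (assumption): thickness = 100 * mean signal."""
    return 100.0 * sum(raw_values) / len(raw_values)
```

For example, with successive groups `[10.0, 10.0]`, `[9.0, 9.0]`, and `[8.0, 8.0]` and a target of 850, the stand-in model predicts 1000, 900, and 800, so the loop signals a halt at index 2.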
Structural properties, e.g. testing or measuring thicknesses, line widths, warpage, bond strengths or physical defects · CPC title
Process monitoring, e.g. flow or thickness monitoring · CPC title
Apparatus for mechanical treatment or grinding or cutting · CPC title
of semiconductor materials · CPC title
Combinations of networks · CPC title