US11475236B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11475236-B2 |
| Application number | US-202016880456-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 21, 2020 |
| Priority date | May 22, 2019 |
| Publication date | Oct 18, 2022 |
| Grant date | Oct 18, 2022 |
Official abstract text for this publication.
A computing system can include an embedding model and a clustering model. The computing system can input each of a plurality of inputs into the embedding model and receive respective embeddings for the plurality of inputs as outputs of the embedding model. The computing system can input the respective embeddings for the plurality of inputs into the clustering model and receive respective cluster assignments for the plurality of inputs as outputs of the clustering model. The computing system can evaluate a clustering loss function that evaluates a first average, across the plurality of inputs, of a respective first entropy of each respective probability distribution; and a second entropy of a second average of the probability distributions for the plurality of inputs. The computing system can modify parameter(s) of one or both of the clustering model and the embedding model based on the clustering loss function.
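The two entropy terms named in the abstract can be illustrated with a minimal sketch. It assumes the loss takes the form (mean per-input entropy) minus (diversity weight times the entropy of the batch-mean distribution), which is one form consistent with claims 2 and 5 but is not stated as a formula in this excerpt. The names `entropy`, `clustering_loss`, and `diversity` are hypothetical, and plain Python lists stand in for model outputs.

```python
import math


def entropy(p):
    """Shannon entropy (natural log) of one probability distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def clustering_loss(assignments, diversity=1.0):
    """Assumed form: mean per-input entropy minus scaled batch-mean entropy.

    assignments: one probability distribution per input, all over the
        same set of clusters (the "cluster assignments" of claim 1).
    diversity: hypothetical name for the hyperparameter of claim 5.
    """
    n = len(assignments)
    k = len(assignments[0])
    # First term: average, across inputs, of each distribution's entropy.
    first_average = sum(entropy(p) for p in assignments) / n
    # Second term: entropy of the average distribution across inputs.
    mean_distribution = [sum(p[j] for p in assignments) / n for j in range(k)]
    second_entropy = entropy(mean_distribution)
    # The loss rises with the first term and falls with the second.
    return first_average - diversity * second_entropy


# Toy batches over two clusters (illustrative values only).
confident_and_balanced = [[1.0, 0.0], [0.0, 1.0]]
uncertain = [[0.5, 0.5], [0.5, 0.5]]
collapsed = [[1.0, 0.0], [1.0, 0.0]]

for name, batch in [
    ("confident_and_balanced", confident_and_balanced),
    ("uncertain", uncertain),
    ("collapsed", collapsed),
]:
    print(name, clustering_loss(batch))
```

Under this assumed form, the loss is lowest when each input is assigned confidently to one cluster while the batch as a whole spreads across clusters. The uncertain batch and the collapsed batch both score higher.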
Opening claim text (preview).
What is claimed is:
1. A computing system comprising: one or more processors; an embedding model configured to receive each of a plurality of inputs and to respectively process each input to produce a respective embedding; a clustering model configured to receive the respective embedding for each input and to respectively process the respective embedding for each input to produce a respective cluster assignment for each input, wherein the respective cluster assignment for each input comprises a respective probability distribution for the respective embedding with respect to a plurality of clusters; one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising: inputting each of the plurality of inputs into the embedding model; receiving the respective embeddings for the plurality of inputs as outputs of the embedding model; inputting the respective embeddings for the plurality of inputs into the clustering model; receiving the respective cluster assignments for the plurality of inputs as outputs of the clustering model; evaluating a clustering loss function that evaluates: a first average, across the plurality of inputs, of a respective first entropy of each respective probability distribution; and a second entropy of a second average of the probability distributions for the plurality of inputs; and modifying one or more parameters of one or both of the clustering model and the embedding model based on the clustering loss function.
2. The computing system of claim 1, wherein the clustering loss function provides a loss value that is positively correlated with the first average and negatively correlated with the second entropy.
3. The computing system of claim 1, wherein modifying parameters of at least one of the clustering model or the embedding model based on the clustering loss function comprises modifying respective parameters of each of the clustering model and the embedding model based on the clustering loss function.
4. The computing system of claim 1, wherein modifying parameters of at least one of the clustering model or the embedding model based on the clustering loss function comprises modifying parameters of the clustering model and holding parameters of the embedding model constant.
5. The computing system of claim 1, wherein the second entropy is scaled by a diversity hyperparameter.
6. The computing system of claim 1, wherein the cluster assignment that describes the mapping of the embedding with respect to the plurality of clusters describes respective centroids of the plurality of clusters.
7. The computing system of claim 1, wherein the cluster assignment that describes the mapping of the embedding with respect to the plurality of clusters comprises an encoding of respective elements of the input with respect to one or more of the plurality of clusters.
8. The computing system of claim 1, further comprising a machine-learned primary model configured to receive the embedding, and in response to receiving the embedding, output a primary output, and wherein the operations further comprise modifying parameters of the embedding model based on a primary loss function evaluated with respect to the primary output of the machine-learned primary model.
9. The computing system of claim 1, wherein the operations further comprise, at an inference time, inputting an input into the embedding model, and receiving the embedding as an output of the embedding model.
10. A computing system comprising: one or more processors; an embedding model configured to receive each of a plurality of inputs and to respectively process each input to produce a respective embedding; a clustering model configured to receive the respective embedding for each input and to respectively process the respective embedding for each input to produce a respective cluster assignment for each input, wherein the respective cluster assignment for each input comprises a respective probability distribution for the respective embedding with respect to a plurality of clusters, and wherein at least one of the embedding model or the clustering model has been trained based on a clustering loss function that comprises: a first average, across the plurality of inputs, of a respective first entropy of each respective probability distribution; and a second entropy of a second average of the probability distributions for the plurality of inputs; one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising: inputting an additional input into the embedding model; receiving an additional embedding as an output of the embedding model, the additional embedding generated by the embedding model by processing the additional input; inputting the additional embedding into the clustering model; and receiving an additional cluster assignment as an output of the clustering model, the additional cluster assignment generated by the clustering model by processing the additional embedding.
11. The computing system of claim 10, wherein the clustering loss function provides a loss value that is positively correlated with the first average and negatively correlated with the second entropy.
12. The computing system of claim 10, wherein the second entropy is scaled by a diversity hyperparameter.
13. The computing system of claim 10, wherein the cluster assignment that describes the mapping of the embedding with respect to the plurality of clusters describes respective centroids of the plurality of clusters.
14. The computing system of claim 10, wherein the additional cluster assignment describes an additional mapping of the additional embedding with respect to an additional plurality of clusters and comprises an encoding of respective elements of the additional input with respect to one or more of the additional plurality of clusters.
15. The computing system of claim 10, further comprising a machine-learned primary model configured to receive the embedding, and in response to receiving the embedding, output a primary output, and wherein the operations further comprise modifying parameters of the embedding model based on a primary loss function evaluated with respect to the primary output of the machine-learned primary model.
16. A method for training one or more machine learned models, the method comprising: inputting, by one or more computing devices, each of a plurality of inputs into an embedding model that is configured to respectively process each input to produce a respective embedding; receiving, by the one or more computing devices, the respective embeddings for the plurality of inputs as outputs of the embedding model; inputting, by the one or more computing devices, the embeddings of the plurality of inputs into a clustering model that is configured to receive the respective embedding for each input and to respectively process the respective embedding for each input to produce a respective cluster assignment for each input, wherein the respective cluster assignment for each input comprises a respective probability distribution for the respective embedding with respect to a plurality of clusters; receiving, by the one or more computing devices, the cluster assignment as an output of the clustering model; evaluating, by the one or more computing devices, a clustering loss function that comprises: a mean per-exa
Incorporation of unlabelled data, e.g. multiple instance learning [MIL] · CPC title
using statistics or function optimisation, e.g. modelling of probability density functions · CPC title
Machine learning · CPC title
based on naturality criteria, e.g. with non-negative factorisation or negative correlation · CPC title
Combinations of networks · CPC title