Minimum-example/maximum-batch entropy-based clustering with neural networks

US11475236B2 · US · B2

Patent metadata
Publication number: US-11475236-B2
Application number: US-202016880456-A
Country: US
Kind code: B2
Filing date: May 21, 2020
Priority date: May 22, 2019
Publication date: Oct 18, 2022
Grant date: Oct 18, 2022

Abstract

Official abstract text for this publication.

A computing system can include an embedding model and a clustering model. The computing system can input each of a plurality of inputs into the embedding model and receive respective embeddings for the plurality of inputs as outputs of the embedding model. The computing system can input the respective embeddings for the plurality of inputs into the clustering model and receive respective cluster assignments for the plurality of inputs as outputs of the clustering model. The computing system can evaluate a clustering loss function that evaluates a first average, across the plurality of inputs, of a respective first entropy of each respective probability distribution, and a second entropy of a second average of the probability distributions for the plurality of inputs. The computing system can modify one or more parameters of one or both of the clustering model and the embedding model based on the clustering loss function.
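The two quantities the abstract names can be written compactly. The notation below is an assumption introduced for illustration and does not come from the patent text: N is the number of inputs, K the number of clusters, p_i the probability distribution over clusters produced for input i, and H is taken to be the Shannon entropy.

```latex
\[
H(p) = -\sum_{k=1}^{K} p_k \log p_k
\]
\[
\text{first average:}\quad \frac{1}{N}\sum_{i=1}^{N} H(p_i)
\qquad\qquad
\text{second entropy:}\quad H\!\left(\frac{1}{N}\sum_{i=1}^{N} p_i\right)
\]
```

The first term is small when each input is assigned confidently to one cluster; the second term is large when the assignments, averaged over all inputs, are spread across many clusters.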

First claim

Opening claim text (preview).

What is claimed is:

1. A computing system comprising: one or more processors; an embedding model configured to receive each of a plurality of inputs and to respectively process each input to produce a respective embedding; a clustering model configured to receive the respective embedding for each input and to respectively process the respective embedding for each input to produce a respective cluster assignment for each input, wherein the respective cluster assignment for each input comprises a respective probability distribution for the respective embedding with respect to a plurality of clusters; one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising: inputting each of the plurality of inputs into the embedding model; receiving the respective embeddings for the plurality of inputs as outputs of the embedding model; inputting the respective embeddings for the plurality of inputs into the clustering model; receiving the respective cluster assignments for the plurality of inputs as outputs of the clustering model; evaluating a clustering loss function that evaluates: a first average, across the plurality of inputs, of a respective first entropy of each respective probability distribution; and a second entropy of a second average of the probability distributions for the plurality of inputs; and modifying one or more parameters of one or both of the clustering model and the embedding model based on the clustering loss function.

2. The computing system of claim 1, wherein the clustering loss function provides a loss value that is positively correlated with the first average and negatively correlated with the second entropy.

3. The computing system of claim 1, wherein modifying parameters of at least one of the clustering model or the embedding model based on the clustering loss function comprises modifying respective parameters of each of the clustering model and the embedding model based on the clustering loss function.

4. The computing system of claim 1, wherein modifying parameters of at least one of the clustering model or the embedding model based on the clustering loss function comprises modifying parameters of the clustering model and holding parameters of the embedding model constant.

5. The computing system of claim 1, wherein the second entropy is scaled by a diversity hyperparameter.

6. The computing system of claim 1, wherein the cluster assignment that describes the mapping of the embedding with respect to the plurality of clusters describes respective centroids of the plurality of clusters.

7. The computing system of claim 1, wherein the cluster assignment that describes the mapping of the embedding with respect to the plurality of clusters comprises an encoding of respective elements of the input with respect to one or more of the plurality of clusters.

8. The computing system of claim 1, further comprising a machine-learned primary model configured to receive the embedding, and in response to receiving the embedding, output a primary output, and wherein the operations further comprise modifying parameters of the embedding model based on a primary loss function evaluated with respect to the primary output of the machine-learned primary model.

9. The computing system of claim 1, wherein the operations further comprise, at an inference time, inputting an input into the embedding model, and receiving the embedding as an output of the embedding model.

10. A computing system comprising: one or more processors; an embedding model configured to receive each of a plurality of inputs and to respectively process each input to produce a respective embedding; a clustering model configured to receive the respective embedding for each input and to respectively process the respective embedding for each input to produce a respective cluster assignment for each input, wherein the respective cluster assignment for each input comprises a respective probability distribution for the respective embedding with respect to a plurality of clusters, and wherein at least one of the embedding model or the clustering model has been trained based on a clustering loss function that comprises: a first average, across the plurality of inputs, of a respective first entropy of each respective probability distribution; and a second entropy of a second average of the probability distributions for the plurality of inputs; one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising: inputting an additional input into the embedding model; receiving an additional embedding as an output of the embedding model, the additional embedding generated by the embedding model by processing the additional input; inputting the additional embedding into the clustering model; and receiving an additional cluster assignment as an output of the clustering model, the additional cluster assignment generated by the clustering model by processing the additional embedding.

11. The computing system of claim 10, wherein the clustering loss function provides a loss value that is positively correlated with the first average and negatively correlated with the second entropy.

12. The computing system of claim 10, wherein the second entropy is scaled by a diversity hyperparameter.

13. The computing system of claim 10, wherein the cluster assignment that describes the mapping of the embedding with respect to the plurality of clusters describes respective centroids of the plurality of clusters.

14. The computing system of claim 10, wherein the additional cluster assignment describes an additional mapping of the additional embedding with respect to an additional plurality of clusters and comprises an encoding of respective elements of the additional input with respect to one or more of the additional plurality of clusters.

15. The computing system of claim 10, further comprising a machine-learned primary model configured to receive the embedding, and in response to receiving the embedding, output a primary output, and wherein the operations further comprise modifying parameters of the embedding model based on a primary loss function evaluated with respect to the primary output of the machine-learned primary model.

16. A method for training one or more machine-learned models, the method comprising: inputting, by one or more computing devices, each of a plurality of inputs into an embedding model that is configured to respectively process each input to produce a respective embedding; receiving, by the one or more computing devices, the respective embeddings for the plurality of inputs as outputs of the embedding model; inputting, by the one or more computing devices, the embeddings of the plurality of inputs into a clustering model that is configured to receive the respective embedding for each input and to respectively process the respective embedding for each input to produce a respective cluster assignment for each input, wherein the respective cluster assignment for each input comprises a respective probability distribution for the respective embedding with respect to a plurality of clusters; receiving, by the one or more computing devices, the cluster assignment as an output of the clustering model; evaluating, by the one or more computing devices, a clustering loss function that comprises: a mean per-exa
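Claims 1, 2, and 5 together suggest a loss that rises with the average per-input entropy and falls with the entropy of the averaged distribution, with the latter scaled by a diversity hyperparameter. The following is a minimal sketch under that reading, not the patented implementation: the subtraction form, the natural-log Shannon entropy, and the names `entropy`, `clustering_loss`, and `diversity` are all assumptions made for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution.

    Terms with zero probability are skipped, i.e. 0 * log(0) is treated as 0.
    """
    return -sum(x * math.log(x) for x in p if x > 0.0)


def clustering_loss(probs, diversity=1.0):
    """Assumed loss: mean per-input entropy minus scaled entropy of the mean.

    probs: list of per-input probability distributions over the same clusters.
    diversity: hypothetical name for the scaling factor on the second entropy.
    """
    n = len(probs)
    k = len(probs[0])
    # First average: entropy of each input's distribution, averaged over inputs.
    mean_entropy = sum(entropy(p) for p in probs) / n
    # Second average: the distributions themselves, averaged over inputs.
    mean_dist = [sum(p[j] for p in probs) / n for j in range(k)]
    # Positively tied to the first average, negatively to the second entropy.
    return mean_entropy - diversity * entropy(mean_dist)


# Three toy batches of two inputs over two clusters.
confident_balanced = [[1.0, 0.0], [0.0, 1.0]]  # confident, clusters used evenly
collapsed = [[1.0, 0.0], [1.0, 0.0]]           # confident, one cluster only
uncertain = [[0.5, 0.5], [0.5, 0.5]]           # every input undecided

loss_balanced = clustering_loss(confident_balanced)
loss_collapsed = clustering_loss(collapsed)
loss_uncertain = clustering_loss(uncertain)
```

Under these assumptions, the confident and balanced batch receives the lowest loss of the three, while both the collapsed batch and the fully uncertain batch score higher. That ordering is the behavior the title's "minimum-example/maximum-batch entropy" wording points to.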

Assignees

  • Google LLC

Inventors

Classifications

  • Incorporation of unlabelled data, e.g. multiple instance learning [MIL] · CPC title

  • using statistics or function optimisation, e.g. modelling of probability density functions · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

  • based on naturality criteria, e.g. with non-negative factorisation or negative correlation · CPC title

  • Combinations of networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11475236B2 cover?
A computing system can include an embedding model and a clustering model. The computing system can input each of a plurality of inputs into the embedding model and receive respective embeddings for the plurality of inputs as outputs of the embedding model. The computing system can input the respective embeddings for the plurality of inputs into the clustering model and receive respective clus…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 18, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).