Computer-implemented training method, classification method and system and computer-readable recording medium

US12073322B2 · US · B2

Patent metadata
Publication number: US-12073322-B2
Application number: US-202117327030-A
Country: US
Kind code: B2
Filing date: May 21, 2021
Priority date: May 22, 2020
Publication date: Aug 27, 2024
Grant date: Aug 27, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method for training a classifier (Φη), including: S10) training a pretext model (ΦΘ) to learn a pretext task, so as to minimize a distance between an output of a source sample via the pretext model (ΦΘ) and an output of a corresponding transformed sample via the pretext model (ΦΘ), the transformed sample being a sample obtained by applying a transformation (T) to the source sample; S20) determining a neighborhood (NXi) of samples (Xi) of a dataset (SD) in the embedding space; S30) training the classifier (Φη) to predict respective estimated probabilities Φηj(Xi), j=1 . . . C, for a sample (Xi) to belong to respective clusters (Cj), by using a second training criterion which tends to: maximize a likelihood for a sample and its neighbors (Xj) of its neighborhood (NXi) to belong to the same cluster; and force the samples to be distributed over several clusters.
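
Step S20 of the abstract can be illustrated with a minimal sketch. It assumes the pretext-model outputs are already available as a NumPy array and that the distance is Euclidean; the patent text shown here does not fix the distance, and the function name `nearest_neighbors` and the toy data are hypothetical.

```python
import numpy as np


def nearest_neighbors(embeddings: np.ndarray, k: int) -> np.ndarray:
    """Return, for each row i, the indices of the k rows closest to row i."""
    # Pairwise squared Euclidean distances (assumed metric, see lead-in).
    sq_norms = np.sum(embeddings ** 2, axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * embeddings @ embeddings.T
    # A sample is not counted as its own neighbor.
    np.fill_diagonal(dists, np.inf)
    # Sort each row by distance and keep the k nearest indices.
    return np.argsort(dists, axis=1)[:, :k]


# Toy embeddings standing in for pretext-model outputs: two separated groups.
embeddings = np.array(
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.2],
     [5.0, 5.0], [5.1, 5.0], [5.0, 5.3]]
)
neighbors = nearest_neighbors(embeddings, k=2)
```

Each row of `neighbors` plays the role of the neighborhood NXi for sample Xi. For large datasets a dedicated nearest-neighbor index would likely replace the dense distance matrix used in this sketch.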

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented training method for training a classifier (Φη), wherein: a transformed sample being a sample obtained by applying a transformation (T) to a source sample, where the source sample is a datum of a source dataset (SD); the training method comprises:

S10) training a pretext model (ΦΘ) to learn a pretext task, based on a source dataset (SD), by using a first training criterion which tends to minimize, across the source samples of the source dataset, a distance between an output of a source sample via the pretext model (ΦΘ) and an output of a corresponding transformed sample via the pretext model (ΦΘ);

S20) for at least one sample among the samples (Xi) of the source dataset (SD), determining a neighborhood (NXi) of the at least one sample; wherein for the at least one sample, the neighborhood (NXi) of the at least one sample comprises K closest neighbors of the sample, K being an integer, K>=1, the K closest neighbors of the sample being K samples Xj of the dataset having smallest distances between ΦΘ(Xi) and ΦΘ(Xj);

S30) training the classifier Φη to predict respective estimated probabilities Φηj(Xi), j=1 . . . C, for a sample to belong to respective clusters (Cj), by using a second training criterion which: tends to maximize a likelihood for a sample and a neighbor (Xj) of the sample belonging to the neighborhood (NXi) of the sample to belong to the same cluster; and tends to force the samples to be distributed over a plurality of clusters; the second training criterion includes a summation:

$$\Lambda = -\frac{1}{|D|} \sum_{X \in D} \sum_{k \in N_{X_i}} f\left\langle \Phi_\eta(X), \Phi_\eta(k) \right\rangle$$

where f is an increasing continuous function, for instance a logarithm; <, > is a dot product; D is a dataset used for training the classifier at step S30; and |D| is the number of samples in the dataset.

2. The training method of claim 1, wherein the first training criterion includes, for a considered sample, a term which increases when differences between a prediction ΦΘ(Xi) for the considered sample (Xi) and a prediction ΦΘ(Tj(Xi)) for the corresponding transformed sample (T(Xi)) increases.

3. The training method according to claim 1, wherein at step S20, for at least a set of two or more similar samples, for which it has been determined that the similar samples should belong to the same cluster, a neighborhood is determined for each of the similar samples, the neighborhood comprising at least the other one(s) of the similar samples.

4. The training method according to claim 1, wherein at step S30, the second training criterion is configured, for at least a set of two or more similar samples of the dataset, for which it has been determined that the similar samples should belong to the same cluster, to tend to maximize a likelihood for the similar samples of the considered set of similar samples to belong to the same cluster.

5. The training method of claim 1, further comprising executing at least one time a fine-tuning step S40: S40) training the classifier (Φη) based on a third training criterion (Λ), the third training criterion being configured to maximize, for each considered sample (Xi) among high-confidence samples (XHCi) whose highest probability (Φηmax(Xi)) is above a predetermined threshold (Thr), a probability (Φηj(Xi)) for the considered sample to belong to the cluster (Cj) indicated by a maximum coordinate (Φηmax(Xi)) of the prediction (Φηmax(Xi)) for the considered sample (Xi).

6. The training method of claim 5, wherein an execution of fine-tuning step(s) (S40) is stopped when it is determined that the number of high-confidence samples does not increase anymore.

7. The training method according to claim 5, wherein at step S40, the third training criterion is configured, for at least a set of two or more similar samples of the dataset, for which it has been determined that the similar samples should belong to the same cluster, to tend to maximize a likelihood for the similar samples of the considered set of similar samples to belong to the same cluster.

8. The training method according to claim 5, wherein a strong augmentation (SA(Xi)) of a high confidence sample (XHCi) is added to the dataset used for a step S40; and the third training criterion is configured to maximize a likelihood for the strong augmentation (SA(Xi)) to belong to the same cluster as the high confidence sample (XHCi).

9. The training method according to claim 1, wherein the training method comprises, at one or more of steps S20, S30 and S40, taking into account prior knowledge that a plurality of samples, called similar samples, form a cluster; whereby, at step S20, the similar samples are to be considered as neighbors when defining the neighborhoods, and/or at step
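
The summation Λ of claim 1 and the confidence threshold of claim 5 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: f is taken to be the natural logarithm (the claim offers it only as an example), the classifier outputs are given as a NumPy array of per-cluster probabilities, and the names `consistency_term`, `high_confidence_mask` and the toy values are hypothetical. The term that forces samples to spread over several clusters is not specified in the preview and is left out.

```python
import numpy as np


def consistency_term(probs: np.ndarray, neighbors: np.ndarray) -> float:
    """Compute -(1/|D|) * sum over samples X and neighbors k of log <probs[X], probs[k]>."""
    # probs[neighbors] has shape (|D|, K, C): predictions of every sample's neighbors.
    dots = np.einsum("ic,ikc->ik", probs, probs[neighbors])
    # f = log (assumption); the small epsilon guards against log(0).
    return float(-np.sum(np.log(dots + 1e-12)) / probs.shape[0])


def high_confidence_mask(probs: np.ndarray, thr: float) -> np.ndarray:
    """Mark samples whose highest cluster probability exceeds the threshold Thr."""
    return probs.max(axis=1) > thr


# Toy predictions for 4 samples over C = 2 clusters.
probs = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.6, 0.4]])

# Neighborhoods (K = 1) pairing samples that mostly agree, then ones that disagree.
agreeing = np.array([[1], [0], [3], [2]])
disagreeing = np.array([[2], [2], [0], [0]])

lam_agree = consistency_term(probs, agreeing)
lam_disagree = consistency_term(probs, disagreeing)
mask = high_confidence_mask(probs, thr=0.8)
```

The value is lower when neighbors receive similar predictions, which is the behavior the second training criterion rewards. In the fine-tuning step S40, only the samples selected by the mask would contribute to the third training criterion.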

Assignees

Inventors

Classifications

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning

  • Convolutional networks [CNN, ConvNet]

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

  • using neural networks

  • Distances to cluster centroids

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12073322B2 cover?
A computer-implemented method for training a classifier (Φη), including: training a pretext model (ΦΘ) to learn a pretext task, so as to minimize a distance between an output of a source sample via the pretext model (ΦΘ) and an output of a corresponding transformed sample via the pretext model (ΦΘ), the transformed sample being a sample obtained by applying a transformation (T) to the source sa…
Who is the assignee on this patent?
Toyota Motor Co Ltd, Univ Leuven Kath
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 27, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).