Methods and systems for dynamic constitutional guidance using artificial intelligence
US-2021343407-A1 · Nov 4, 2021 · US
US12073322B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12073322-B2 |
| Application number | US-202117327030-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 21, 2021 |
| Priority date | May 22, 2020 |
| Publication date | Aug 27, 2024 |
| Grant date | Aug 27, 2024 |
Official abstract text for this publication.
A computer-implemented method for training a classifier (Φη), including: S10) training a pretext model (ΦΘ) to learn a pretext task, so as to minimize a distance between an output of a source sample via the pretext model (ΦΘ) and an output of a corresponding transformed sample via the pretext model (ΦΘ), the transformed sample being a sample obtained by applying a transformation (T) to the source sample; S20) determining a neighborhood (NXi) of samples (Xi) of a dataset (SD) in the embedding space; S30) training the classifier (Φη) to predict respective estimated probabilities Φηj(Xi), j = 1 … C, for a sample (Xi) to belong to respective clusters (Cj), by using a second training criterion which tends to: maximize a likelihood for a sample and its neighbors (Xj) of its neighborhood (NXi) to belong to the same cluster; and force the samples to be distributed over several clusters.
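Steps S20 and S30 of the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the patented implementation. The function names, the Euclidean distance, the exclusion of a sample from its own neighborhood, the logarithm as the increasing function, the entropy-style spread term, and the `weight` parameter are all assumptions made for illustration. The source only states that the criterion favors same-cluster assignment for neighbors and forces samples to spread over several clusters.

```python
import math

EPS = 1e-12  # assumption: small constant to keep log() finite


def k_nearest_neighbors(embeddings, k):
    """S20 sketch: for each sample i, indices of its k closest other samples.

    Assumes Euclidean distance between pretext-model outputs and excludes
    the sample itself from its own neighborhood.
    """
    neighbors = []
    for i, e_i in enumerate(embeddings):
        dists = sorted(
            (math.dist(e_i, e_j), j)
            for j, e_j in enumerate(embeddings)
            if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def consistency_term(probs, neighbors):
    """-(1/|D|) * sum over samples and their neighbors of log <p(X), p(k)>."""
    total = 0.0
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            dot = sum(a * b for a, b in zip(probs[i], probs[j]))
            total += math.log(dot + EPS)
    return -total / len(probs)


def spread_term(probs):
    """Assumed spread penalty: negative entropy of the mean cluster probabilities.

    It is lowest when the average assignment is uniform over clusters.
    """
    n, c = len(probs), len(probs[0])
    mean = [sum(p[j] for p in probs) / n for j in range(c)]
    return sum(m * math.log(m + EPS) for m in mean)


def second_criterion(probs, neighbors, weight=1.0):
    """S30 sketch: value to minimize (consistency plus weighted spread penalty)."""
    return consistency_term(probs, neighbors) + weight * spread_term(probs)
```

In this sketch, predictions that agree within each neighborhood lower the first term, while predictions that collapse onto a single cluster raise the second term.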
Opening claim text (preview).
What is claimed is:

1. A computer-implemented training method for training a classifier (Φη), wherein: a transformed sample being a sample obtained by applying a transformation (T) to a source sample, where the source sample is a datum of a source dataset (SD); the training method comprises:

S10) training a pretext model (ΦΘ) to learn a pretext task, based on a source dataset (SD), by using a first training criterion which tends to minimize, across the source samples of the source dataset, a distance between an output of a source sample via the pretext model (ΦΘ) and an output of a corresponding transformed sample via the pretext model (ΦΘ);

S20) for at least one sample among the samples (Xi) of the source dataset (SD), determining a neighborhood (NXi) of the at least one sample; wherein for the at least one sample, the neighborhood (NXi) of the at least one sample comprises K closest neighbors of the sample, K being an integer, K>=1, the K closest neighbors of the sample being K samples Xj of the dataset having smallest distances between ΦΘ(Xi) and ΦΘ(Xj);

S30) training the classifier Φη to predict respective estimated probabilities Φηj(Xi), j = 1 … C, for a sample to belong to respective clusters (Cj), by using a second training criterion which: tends to maximize a likelihood for a sample and a neighbor (Xj) of the sample belonging to the neighborhood (NXi) of the sample to belong to the same cluster; and tends to force the samples to be distributed over a plurality of clusters;

the second training criterion includes a summation:

$$\Lambda = -\frac{1}{|D|} \sum_{X \in D} \; \sum_{k \in N_{X_i}} f\left(\langle \Phi_\eta(X), \Phi_\eta(k) \rangle\right)$$

where f is an increasing continuous function, for instance a logarithm; ⟨ , ⟩ is a dot product; D is a dataset used for training the classifier at step S30; and |D| is the number of samples in the dataset.

2. The training method of claim 1, wherein the first training criterion includes, for a considered sample, a term which increases when differences between a prediction ΦΘ(Xi) for the considered sample (Xi) and a prediction ΦΘ(Tj(Xi)) for the corresponding transformed sample (T(Xi)) increases.

3. The training method according to claim 1, wherein at step S20, for at least a set of two or more similar samples, for which it has been determined that the similar samples should belong to the same cluster, a neighborhood is determined for each of the similar samples, the neighborhood comprising at least the other one(s) of the similar samples.

4. The training method according to claim 1, wherein at step S30, the second training criterion is configured, for at least a set of two or more similar samples of the dataset, for which it has been determined that the similar samples should belong to the same cluster, to tend to maximize a likelihood for the similar samples of the considered set of similar samples to belong to the same cluster.

5. The training method of claim 1, further comprising executing at least one time a fine-tuning step S40: S40) training the classifier (Φη) based on a third training criterion (Λ), the third training criterion being configured to maximize, for each considered sample (Xi) among high-confidence samples (XHCi) whose highest probability (Φηmax(Xi)) is above a predetermined threshold (Thr), a probability (Φηj(Xi)) for the considered sample to belong to the cluster (Cj) indicated by a maximum coordinate (Φηmax(Xi)) of the prediction (Φηmax(Xi)) for the considered sample (Xi).

6. The training method of claim 5, wherein an execution of fine-tuning step(s) (S40) is stopped when it is determined that the number of high-confidence samples does not increase anymore.

7. The training method according to claim 5, wherein at step S40, the third training criterion is configured, for at least a set of two or more similar samples of the dataset, for which it has been determined that the similar samples should belong to the same cluster, to tend to maximize a likelihood for the similar samples of the considered set of similar samples to belong to the same cluster.

8. The training method according to claim 5, wherein a strong augmentation (SA(Xi)) of a high confidence sample (XHCi) is added to the dataset used for a step S40; and the third training criterion is configured to maximize a likelihood for the strong augmentation (SA(Xi)) to belong to the same cluster as the high confidence sample (XHCi).

9. The training method according to claim 1, wherein the training method comprises, at one or more of steps S20, S30 and S40, taking into account prior knowledge that a plurality of samples, called similar samples, form a cluster; whereby, at step S20, the similar samples are to be considered as neighbors when defining the neighborhoods, and/or at step
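The fine-tuning step S40 of claims 5 and 6 can be sketched as follows, using only the Python standard library. This is an illustrative reading, not the patent's implementation. The function names are hypothetical, and the negative log-likelihood form of the third criterion is an assumption: the claim only says the criterion maximizes the probability of the cluster indicated by the highest coordinate of the prediction.

```python
import math

EPS = 1e-12  # assumption: small constant to keep log() finite


def high_confidence_pseudo_labels(probs, thr):
    """Map sample index -> cluster index for samples whose top probability exceeds thr."""
    selected = {}
    for i, p in enumerate(probs):
        top = max(p)
        if top > thr:
            selected[i] = p.index(top)  # cluster given by the maximum coordinate
    return selected


def third_criterion(probs, pseudo_labels):
    """Assumed loss: mean negative log-probability of each pseudo-labelled cluster.

    Minimizing it maximizes the probability of the selected cluster.
    """
    if not pseudo_labels:
        return 0.0
    total = sum(math.log(probs[i][c] + EPS) for i, c in pseudo_labels.items())
    return -total / len(pseudo_labels)


def should_stop(previous_count, current_count):
    """Claim 6 sketch: stop once the number of high-confidence samples no longer grows."""
    return current_count <= previous_count
```

A training loop built on this sketch would recompute the pseudo-labels after each S40 pass and call `should_stop` with the previous and current counts of selected samples.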
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
using neural networks · CPC title
Distances to cluster centroïds · CPC title