Method of training image representation model

US12475680B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12475680-B2
Application number	US-202318116602-A
Country	US
Kind code	B2
Filing date	Mar 2, 2023
Priority date	Sep 2, 2022
Publication date	Nov 18, 2025
Grant date	Nov 18, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method generates an anchor image embedding vector for an anchor image using an image representation model, determine first similarities between the anchor image and negative samples of the anchor image using first image embedding vectors for the negative samples and the generated anchor image embedding vector, determine second similarities between the anchor image and positive samples of the anchor image using second image embedding vectors for the positive samples and the generated anchor image embedding vector, obtain one of a vector corresponding to a label of the anchor image and third similarities between the label of the anchor image and labels of the negative samples, determine a loss value for the anchor image based on the determined first similarities, and the determined second similarities, and one of the obtained third similarities and a fourth similarity.
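
The abstract does not say how a similarity is measured. As a minimal sketch, assuming cosine similarity and toy three-dimensional embeddings, the first similarities (anchor versus negatives) and second similarities (anchor versus positives) could be computed as follows. The function name, the vectors, and the choice of cosine similarity are illustrative assumptions, not taken from the patent.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (an assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings; in the patent these would come from the
# image representation model being trained.
anchor = [1.0, 0.0, 0.0]
positives = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]  # same class as the anchor
negatives = [[0.0, 1.0, 0.0], [0.1, 0.0, 1.0]]  # other classes

# "First similarities": anchor versus each negative sample.
first_similarities = [cosine_similarity(anchor, n) for n in negatives]
# "Second similarities": anchor versus each positive sample.
second_similarities = [cosine_similarity(anchor, p) for p in positives]
```

With these toy vectors every positive scores higher than every negative, which is the separation a contrastive loss is meant to encourage.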

First claim

Opening claim text (preview).

What is claimed is:
1. A training method performed by a computing apparatus, the training method comprising: generating an anchor image embedding vector for an anchor image using an image representation model; determining first similarities between the anchor image and negative samples of the anchor image using first image embedding vectors for the negative samples and the generated anchor image embedding vector; determining second similarities between the anchor image and positive samples of the anchor image using second image embedding vectors for the positive samples and the generated anchor image embedding vector; obtaining one of a vector corresponding to a label of the anchor image and third similarities between the label of the anchor image and labels of the negative samples; determining a loss value for the anchor image based on (i) the determined first similarities, (ii) the determined second similarities, and (iii) one of the obtained third similarities and a fourth similarity, wherein the fourth similarity is a similarity between the obtained vector and the generated anchor image embedding vector; and updating weights of the image representation model based on the determined loss value.
2. The training method of claim 1, wherein the positive samples and the anchor image belong to a same class, and the negative samples do not belong to the class.
3. The training method of claim 1, wherein the determining of the loss value comprises: applying the obtained third similarities as weights to each of the determined first similarities; calculating normalized values for the obtained third similarities; and determining the loss value using a result of applying the obtained third similarities as weights to each of the determined first similarities, the calculated normalized values, and the determined second similarities.
4. The training method of claim 1, further comprising: determining similarities of pairings of labels of respective images in a training data set using an embedding model; generating a first dictionary to store the similarities for the pairings; forming a batch of images extracted from the training data set; forming an image set corresponding to the batch by performing augmentation on the images in the formed batch; and retrieving, from the first dictionary, similarities for respective pairings of labels of the batch.
5. The training method of claim 4, wherein the obtaining comprises obtaining the third similarities from among the retrieved similarities.
6. The training method of claim 1, wherein the third similarities are similarities between the vector corresponding to the label of the anchor image and vectors corresponding to the labels of the negative samples, and the vector corresponding to the label of the anchor image and the vectors corresponding to the labels of the negative samples are generated by an embedding model.
7. The training method of claim 1, wherein the determining of the loss value comprises: determining an initial loss value using the determined first similarities and the determined second similarities; applying a first weight to the determined initial loss value; applying a second weight to the fourth similarity; and determining the loss value by subtracting the fourth similarity to which the second weight is applied from the initial loss value to which the first weight is applied.
8. The training method of claim 7, wherein the sum of the first weight and the second weight is 1.
9. The training method of claim 1, further comprising: generating vectors respectively corresponding to labels of a training data set using an embedding model; generating a second dictionary to store the generated vectors; forming a batch by extracting images from the training data set; forming an image set corresponding to the batch by performing augmentation on the images in the formed batch; and retrieving vectors corresponding to labels of the batch from the second dictionary.
10. The training method of claim 9, wherein the obtaining comprises obtaining the vector corresponding to the label of the anchor image from among the retrieved vectors.
11. A computing apparatus, comprising: a memory configured to store one or more instructions; and a processor configured to execute the stored instructions, wherein, when the instructions are executed, the processor is configured to: generate an anchor image embedding vector for an anchor image using an image representation model, determine first similarities between the anchor image and negative samples of the anchor image using first image embedding vectors for the negative samples and the generated anchor image embedding vector, determine second similarities between the anchor image and positive samples of the anchor image using second image embedding vectors for the positive samples and the generated anchor image embedding vector, obtain one of a vector corresponding to a label of the anchor image and third similarities between the label of the anchor image and labels of the negative samples, determine a loss value for the anchor image based on (i) the determined first similarities, (iii) the determined second similarities, and (iii) one of the obtained third similarities and a fourth similarity, wherein the fourth similarity is a similarity between the obtained vector and the generated anchor image embedding vector, and update weights of the image representation model based on the determined loss value.
12. The computing apparatus of claim 11, wherein the positive samples and the anchor image belong to a same class, and the negative samples and the anchor image do not belong to the class.
13. The computing apparatus of claim 11, wherein the processor is configured to apply the obtained third similarities as weights to each of the determined first similarities, calculate normalized values for the obtained third similarities, and determine the loss value using a result of applying the obtained third similarities as weights to each of the determined first similarities, the calculated normalized values, and the determined second similarities.
14. The computing apparatus of claim 11, wherein the processor is configured to determine similarities of pairings of labels of respective images in a training data set using an embedding model, generate a first dictionary to store the similarities for the pairings, form a batch of images extracted from the training data set, form an image set corresponding to the batch by performing augmentation on the images in the formed batch, and retrieve, from the first dictionary, similarities for respective pairings of labels of the batch.
15. The computing apparatus of claim 14, wherein the processor is configured to obtain the third similarities from among the retrieved similarities.
16. The computing apparatus of claim 11, wherein the third similarities are similarities between the vector corresponding to the label of the anchor image and vectors corresponding to the labels of the negative samples, and the vector corresponding to the label of the anchor image and the vectors corresponding to the labels of the negative samples are generated by an embedding model.
17. The computing apparatus of claim 11, wherein the processor is configured to determine an initial loss value using the determined first similarities and the determined second similarities, apply a first weight to the determined initial loss value, apply a second weight to the fourth similarity, and de…
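
Claims 7 and 8 describe the final loss as a weighted initial loss minus a weighted fourth similarity, with the two weights summing to 1. The sketch below illustrates that combination under stated assumptions: the claim preview does not give a formula for the initial loss, so a softmax-style contrastive form with a temperature is assumed here, and every function name, parameter, and default value is hypothetical.

```python
import math

def initial_contrastive_loss(first_sims, second_sims, temperature=0.1):
    """Assumed form of the initial loss (not specified in the claim preview).

    For each positive similarity s_p, computes
    -log(exp(s_p / t) / (exp(s_p / t) + sum_n exp(s_n / t)))
    and averages over the positives.
    """
    neg_sum = sum(math.exp(s / temperature) for s in first_sims)
    terms = []
    for s_p in second_sims:
        pos = math.exp(s_p / temperature)
        terms.append(-math.log(pos / (pos + neg_sum)))
    return sum(terms) / len(terms)

def combined_loss(first_sims, second_sims, fourth_sim, first_weight=0.7):
    """Claim 7: first_weight * initial loss - second_weight * fourth similarity.

    Claim 8: the two weights sum to 1, so second_weight = 1 - first_weight.
    The default first_weight is an arbitrary illustrative value.
    """
    second_weight = 1.0 - first_weight
    initial = initial_contrastive_loss(first_sims, second_sims)
    return first_weight * initial - second_weight * fourth_sim

# Hypothetical similarity values for one anchor image.
first_sims = [0.0, 0.1]     # anchor versus negatives
second_sims = [0.99, 0.97]  # anchor versus positives
loss_low_alignment = combined_loss(first_sims, second_sims, fourth_sim=0.1)
loss_high_alignment = combined_loss(first_sims, second_sims, fourth_sim=0.9)
```

Because the fourth similarity is subtracted, closer agreement between the label vector and the anchor image embedding lowers the loss, so `loss_high_alignment` is smaller than `loss_low_alignment`.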

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

G06V10/764 · Primary
using classification, e.g. of video objects · CPC title
G06N20/00
Machine learning · CPC title
G06V20/70
Labelling scene content, e.g. deriving syntactic or semantic representations · CPC title
G06V10/761 · Primary
Proximity, similarity or dissimilarity measures · CPC title
G06V10/469
Contour-based spatial representations, e.g. vector-coding · CPC title

Patent family

Related publications grouped by family.

Patent family 90060795

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12475680B2 cover?: A method generates an anchor image embedding vector for an anchor image using an image representation model, determine first similarities between the anchor image and negative samples of the anchor image using first image embedding vectors for the negative samples and the generated anchor image embedding vector, determine second similarities between the anchor image and positive samples of the …
Who is the assignee on this patent?: Samsung Electronics Co Ltd
What technology area does this patent fall under?: Primary CPC classification G06V10/764. Mapped technology areas include Physics.
When was this patent published?: Publication date Nov 18, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Systems and methods for contrastive learning of visual representations

Supervised contrastive learning with multiple positive examples

Systems and methods for video representation learning with a weak teacher

Self-supervised visual-relationship probing

Image embedding for object tracking

Learning unified embedding

Image classification utilizing semantic relationships in a classification hierarchy
