Method of training image representation model

US12475680B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12475680-B2
Application number  US-202318116602-A
Country             US
Kind code           B2
Filing date         Mar 2, 2023
Priority date       Sep 2, 2022
Publication date    Nov 18, 2025
Grant date          Nov 18, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method generates an anchor image embedding vector for an anchor image using an image representation model, determines first similarities between the anchor image and negative samples of the anchor image using first image embedding vectors for the negative samples and the generated anchor image embedding vector, determines second similarities between the anchor image and positive samples of the anchor image using second image embedding vectors for the positive samples and the generated anchor image embedding vector, obtains one of a vector corresponding to a label of the anchor image and third similarities between the label of the anchor image and labels of the negative samples, and determines a loss value for the anchor image based on the determined first similarities, the determined second similarities, and one of the obtained third similarities and a fourth similarity.
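The first and second similarities described in the abstract can be illustrated with a minimal sketch. The abstract does not give a formula, so the cosine similarity, the `temperature` parameter, and the supervised-contrastive-style loss below are assumptions chosen for illustration, and the function names `cosine` and `contrastive_loss` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Contrastive loss for one anchor; an assumed form, not the patent's formula."""
    # "First similarities": anchor embedding vs. negative-sample embeddings.
    first = [cosine(anchor, n) for n in negatives]
    # "Second similarities": anchor embedding vs. positive-sample embeddings.
    second = [cosine(anchor, p) for p in positives]
    denom = sum(math.exp(s / temperature) for s in first + second)
    # Mean over positives of -log(exp(s_p / t) / denom).
    return -sum(s / temperature - math.log(denom) for s in second) / len(second)

# Toy 2-D embeddings (made-up values): the positive lies near the anchor.
anchor = [1.0, 0.0]
well_separated = contrastive_loss(anchor, [[0.9, 0.1]], [[0.0, 1.0]])
confused = contrastive_loss(anchor, [[0.0, 1.0]], [[0.9, 0.1]])
```

In this toy setup the loss is small when the positive sample lies close to the anchor and the negative lies far away, and large when the roles are swapped, which is the behavior a weight update based on the loss would exploit.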

First claim

Opening claim text (preview).

What is claimed is:

1. A training method performed by a computing apparatus, the training method comprising: generating an anchor image embedding vector for an anchor image using an image representation model; determining first similarities between the anchor image and negative samples of the anchor image using first image embedding vectors for the negative samples and the generated anchor image embedding vector; determining second similarities between the anchor image and positive samples of the anchor image using second image embedding vectors for the positive samples and the generated anchor image embedding vector; obtaining one of a vector corresponding to a label of the anchor image and third similarities between the label of the anchor image and labels of the negative samples; determining a loss value for the anchor image based on (i) the determined first similarities, (ii) the determined second similarities, and (iii) one of the obtained third similarities and a fourth similarity, wherein the fourth similarity is a similarity between the obtained vector and the generated anchor image embedding vector; and updating weights of the image representation model based on the determined loss value.

2. The training method of claim 1, wherein the positive samples and the anchor image belong to a same class, and the negative samples do not belong to the class.

3. The training method of claim 1, wherein the determining of the loss value comprises: applying the obtained third similarities as weights to each of the determined first similarities; calculating normalized values for the obtained third similarities; and determining the loss value using a result of applying the obtained third similarities as weights to each of the determined first similarities, the calculated normalized values, and the determined second similarities.

4. The training method of claim 1, further comprising: determining similarities of pairings of labels of respective images in a training data set using an embedding model; generating a first dictionary to store the similarities for the pairings; forming a batch of images extracted from the training data set; forming an image set corresponding to the batch by performing augmentation on the images in the formed batch; and retrieving, from the first dictionary, similarities for respective pairings of labels of the batch.

5. The training method of claim 4, wherein the obtaining comprises obtaining the third similarities from among the retrieved similarities.

6. The training method of claim 1, wherein the third similarities are similarities between the vector corresponding to the label of the anchor image and vectors corresponding to the labels of the negative samples, and the vector corresponding to the label of the anchor image and the vectors corresponding to the labels of the negative samples are generated by an embedding model.

7. The training method of claim 1, wherein the determining of the loss value comprises: determining an initial loss value using the determined first similarities and the determined second similarities; applying a first weight to the determined initial loss value; applying a second weight to the fourth similarity; and determining the loss value by subtracting the fourth similarity to which the second weight is applied from the initial loss value to which the first weight is applied.

8. The training method of claim 7, wherein the sum of the first weight and the second weight is 1.

9. The training method of claim 1, further comprising: generating vectors respectively corresponding to labels of a training data set using an embedding model; generating a second dictionary to store the generated vectors; forming a batch by extracting images from the training data set; forming an image set corresponding to the batch by performing augmentation on the images in the formed batch; and retrieving vectors corresponding to labels of the batch from the second dictionary.

10. The training method of claim 9, wherein the obtaining comprises obtaining the vector corresponding to the label of the anchor image from among the retrieved vectors.

11. A computing apparatus, comprising: a memory configured to store one or more instructions; and a processor configured to execute the stored instructions, wherein, when the instructions are executed, the processor is configured to: generate an anchor image embedding vector for an anchor image using an image representation model, determine first similarities between the anchor image and negative samples of the anchor image using first image embedding vectors for the negative samples and the generated anchor image embedding vector, determine second similarities between the anchor image and positive samples of the anchor image using second image embedding vectors for the positive samples and the generated anchor image embedding vector, obtain one of a vector corresponding to a label of the anchor image and third similarities between the label of the anchor image and labels of the negative samples, determine a loss value for the anchor image based on (i) the determined first similarities, (ii) the determined second similarities, and (iii) one of the obtained third similarities and a fourth similarity, wherein the fourth similarity is a similarity between the obtained vector and the generated anchor image embedding vector, and update weights of the image representation model based on the determined loss value.

12. The computing apparatus of claim 11, wherein the positive samples and the anchor image belong to a same class, and the negative samples and the anchor image do not belong to the class.

13. The computing apparatus of claim 11, wherein the processor is configured to apply the obtained third similarities as weights to each of the determined first similarities, calculate normalized values for the obtained third similarities, and determine the loss value using a result of applying the obtained third similarities as weights to each of the determined first similarities, the calculated normalized values, and the determined second similarities.

14. The computing apparatus of claim 11, wherein the processor is configured to determine similarities of pairings of labels of respective images in a training data set using an embedding model, generate a first dictionary to store the similarities for the pairings, form a batch of images extracted from the training data set, form an image set corresponding to the batch by performing augmentation on the images in the formed batch, and retrieve, from the first dictionary, similarities for respective pairings of labels of the batch.

15. The computing apparatus of claim 14, wherein the processor is configured to obtain the third similarities from among the retrieved similarities.

16. The computing apparatus of claim 11, wherein the third similarities are similarities between the vector corresponding to the label of the anchor image and vectors corresponding to the labels of the negative samples, and the vector corresponding to the label of the anchor image and the vectors corresponding to the labels of the negative samples are generated by an embedding model.

17. The computing apparatus of claim 11, wherein the processor is configured to determine an initial loss value using the determined first similarities and the determined second similarities, apply a first weight to the determined initial loss value, apply a second weight to the fourth similarity, and de
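Two of the dependent claims are concrete enough to sketch in code: the label-pair similarity dictionary of claims 4 and 5, and the weighted combination of claims 7 and 8. The following is a minimal illustration under stated assumptions, not the patented implementation. The label vectors are made-up stand-ins for the output of an embedding model, cosine similarity is an assumed measure, and the names `build_similarity_dictionary` and `combined_loss` are hypothetical.

```python
import math
from itertools import combinations_with_replacement

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def build_similarity_dictionary(label_vectors):
    """Claim 4 sketch: store one similarity for every pairing of labels."""
    table = {}
    for a, b in combinations_with_replacement(sorted(label_vectors), 2):
        sim = cosine(label_vectors[a], label_vectors[b])
        table[(a, b)] = sim
        table[(b, a)] = sim  # store both orders so lookups are order-free
    return table

def combined_loss(initial_loss, fourth_similarity, first_weight=0.7):
    """Claims 7 and 8 sketch: weighted initial loss minus weighted fourth similarity."""
    second_weight = 1.0 - first_weight  # claim 8: the two weights sum to 1
    return first_weight * initial_loss - second_weight * fourth_similarity

# Hypothetical label vectors standing in for an embedding model's output.
label_vectors = {"cat": [1.0, 0.2], "dog": [0.9, 0.4], "truck": [0.0, 1.0]}
first_dictionary = build_similarity_dictionary(label_vectors)

# Claim 5 sketch: third similarities for anchor label "cat" and its negatives.
third_similarities = [first_dictionary[("cat", lbl)] for lbl in ("dog", "truck")]

# Claim 7 sketch with made-up inputs: initial loss 2.0, fourth similarity 0.5.
loss_value = combined_loss(2.0, 0.5, first_weight=0.7)
```

Precomputing the dictionary once over the whole training data set means each batch only needs lookups, which matches the retrieval step the claims describe. Subtracting the weighted fourth similarity lowers the loss as the anchor image embedding moves closer to its label vector.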

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • G06V10/764 (Primary)

    using classification, e.g. of video objects · CPC title

  • Machine learning · CPC title

  • Labelling scene content, e.g. deriving syntactic or semantic representations · CPC title

  • G06V10/761 (Primary)

    Proximity, similarity or dissimilarity measures · CPC title

  • Contour-based spatial representations, e.g. vector-coding · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12475680B2 cover?
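A method generates an anchor image embedding vector for an anchor image using an image representation model, determines first similarities between the anchor image and negative samples of the anchor image using first image embedding vectors for the negative samples and the generated anchor image embedding vector, determines second similarities between the anchor image and positive samples of the anchor image using second image embedding vectors for the positive samples and the generated anchor image embedding vector, obtains one of a vector corresponding to a label of the anchor image and third similarities between the label of the anchor image and labels of the negative samples, determines a loss value for the anchor image based on the determined first similarities, the determined second similarities, and one of the obtained third similarities and a fourth similarity …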
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06V10/764. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 18, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).