Learning device, learning method and program
US-2024020530-A1 · Jan 18, 2024 · US
US12315228B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12315228-B2 |
| Application number | US-202217978425-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 1, 2022 |
| Priority date | Nov 5, 2021 |
| Publication date | May 27, 2025 |
| Grant date | May 27, 2025 |
Official abstract text for this publication.
A processor-implemented method includes: generating a first sample image and a second sample image by performing data augmentation on an input training image; generating a first feature map of the first sample image and a second feature map of the second sample image by performing feature extraction on the first sample image and the second sample image using an encoding model; determining first loss data according to a relationship between first feature vectors of the first feature map and second feature vectors of the second feature map; estimating relative geometric information of the first feature map and the second feature map using a relationship estimation model; determining second loss data according to the relative geometric information, based on label data according to a geometric arrangement of the first sample image and the second sample image in the input training image; and training the encoding model and the relationship estimation model, based on the first loss data and the second loss data.
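The first step of the abstract, and the way the two losses are combined at the end, can be illustrated with a minimal sketch. It assumes the data augmentation is a random rectangular crop and that the two losses are combined as a weighted sum; the abstract specifies neither. The names `Box`, `random_crop`, `scale_label`, `overlap`, and `total_loss`, and the parameters `min_frac` and `weight`, are hypothetical and do not come from the patent.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A crop rectangle in input-image pixel coordinates (hypothetical helper)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def random_crop(img_w, img_h, rng, min_frac=0.4):
    """Sample a crop that lies fully inside the image (assumed augmentation)."""
    w = rng.uniform(min_frac, 1.0) * img_w
    h = rng.uniform(min_frac, 1.0) * img_h
    x = rng.uniform(0.0, img_w - w)
    y = rng.uniform(0.0, img_h - h)
    return Box(x, y, w, h)


def scale_label(a, b):
    """Relative scale of crop b with respect to crop a as (width, height) ratios."""
    return (b.w / a.w, b.h / a.h)


def overlap(a, b):
    """Intersection rectangle of two crops, or None if they do not overlap."""
    x0, y0 = max(a.x, b.x), max(a.y, b.y)
    x1 = min(a.x + a.w, b.x + b.w)
    y1 = min(a.y + a.h, b.y + b.h)
    if x1 <= x0 or y1 <= y0:
        return None
    return Box(x0, y0, x1 - x0, y1 - y0)


def total_loss(first_loss, second_loss, weight=1.0):
    """Assumed combination: a weighted sum of the two loss terms."""
    return first_loss + weight * second_loss
```

Because both crops come from the same input training image, the label data for the second loss follows from the crop coordinates alone, so no manual annotation is needed in this sketch.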
Opening claim text (preview).
What is claimed is:
1. A processor-implemented method comprising: generating a first sample image and a second sample image by performing data augmentation on an input training image; generating a first feature map of the first sample image and a second feature map of the second sample image by performing feature extraction on the first sample image and the second sample image using an encoding model; determining first loss data according to a relationship between first feature vectors of the first feature map and second feature vectors of the second feature map; estimating relative geometric information of the first feature map and the second feature map using a relationship estimation model; determining second loss data according to the relative geometric information, based on label data according to a geometric arrangement of the first sample image and the second sample image in the input training image; and training the encoding model and the relationship estimation model, based on the first loss data and the second loss data.
2. The method of claim 1, wherein the determining of the first loss data comprises: selecting, from among the first feature vectors and the second feature vectors, overlapping feature vectors corresponding to an overlapping region of the first sample image and the second sample image; and determining the first loss data, based on a difference between the overlapping feature vectors.
3. The method of claim 1, wherein the relative geometric information comprises at least a portion of relative position information of corresponding grid cells according to the first feature vectors and the second feature vectors in the input training image and relative scale information of corresponding images according to the first feature map and the second feature map.
4. The method of claim 3, wherein the corresponding images comprise either one or both of: the first sample image and the second sample image; and a first resized image resized from the first sample image and a second resized image resized from the second sample image.
5. The method of claim 3, wherein the relative position information is configured to specify an offset between the corresponding grid cells as an x-axis component and a y-axis component, and the relative scale information is configured to specify a scale ratio of the corresponding images as a width component and a height component.
6. The method of claim 3, wherein the label data comprises at least a portion of label data of the relative position information according to grid cells of the first sample image and the second sample image and label data of the relative scale information according to the first sample image and the second sample image.
7. The method of claim 1, wherein the relative geometric information comprises mask information representing an overlapping region of corresponding images according to the first feature map and the second feature map.
8. The method of claim 7, wherein the label data comprises label data of the mask information according to the first sample image and the second sample image.
9. The method of claim 1, wherein the label data is determined according to the geometric arrangement of the first sample image and the second sample image, and the determining of the second loss data comprises determining the second loss data, based on a difference between the label data and the relative geometric information.
10. The method of claim 1, wherein the encoding model and the relationship estimation model correspond to a neural network model.
11. The method of claim 1, wherein the estimating of the relative geometric information comprises: determining input data by concatenating the first feature map and the second feature map; and estimating the relative geometric information by performing a convolution operation according to the input data.
12. A non-transitory computer-readable storage medium storing instructions that, when executed by one or more processors, configure the one or more processors to perform the method of claim 1.
13. An apparatus comprising: one or more processors configured to: generate a first sample image and a second sample image by performing data augmentation on an input training image; generate a first feature map of the first sample image and a second feature map of the second sample image by performing feature extraction on the first sample image and the second sample image using an encoding model; determine first loss data according to a relationship between first feature vectors of the first feature map and second feature vectors of the second feature map; estimate relative geometric information of the first feature map and the second feature map using a relationship estimation model; determine second loss data according to the relative geometric information, based on label data according to a geometric arrangement of the first sample image and the second sample image in the input training image; and train the encoding model and the relationship estimation model, based on the first loss data and the second loss data.
14. The apparatus of claim 13, wherein, for the determining of the first loss data, the one or more processors are configured to: select, from among the first feature vectors and the second feature vectors, overlapping feature vectors corresponding to an overlapping region of the first sample image and the second sample image; and determine the first loss data, based on a difference between the overlapping feature vectors.
15. The apparatus of claim 13, wherein the relative geometric information comprises at least a portion of relative position information of corresponding grid cells according to the first feature vectors and the second feature vectors in the input training image and relative scale information of corresponding images according to the first feature map and the second feature map.
16. The apparatus of claim 15, wherein the label data comprises label data of the relative position information according to grid cells of the first sample image and the second sample image and label data of the relative scale information according to the first sample image and the second sample image.
17. The apparatus of claim 13, wherein the relative geometric information comprises mask information representing an overlapping region of corresponding images according to the first feature map and the second feature map.
18. The apparatus of claim 17, wherein the label data comprises label data of the mask information according to the first sample image and the second sample image.
19. The apparatus of claim 13, wherein the label data is determined according to the geometric arrangement of the first sample image and the second sample image, and for the determining of the second loss data, the one or more processors are configured to: determine the second loss data, based on a difference between the label data and the relative geometric information.
20. The apparatus of claim 13, wherein the encoding model and the relationship estimation model correspond to a neural network model.
21. The apparatus of claim 13, wherein, for the estimating of the relative geometric information, the one or more processors are configured to: determine input data by concatenating the first feature map and the second feature map; and estimate the relative geometric information by performing a convolution operation according to the input data.
22
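The label data and losses described in claims 2, 5, 7, and 9 can be sketched for a small square grid. This is an illustrative reading, not the patented implementation: it assumes crops are given as `(x, y, w, h)` tuples in input-image coordinates, that "corresponding grid cells" are same-index cells of the two feature maps, that a cell counts as overlapping when its center falls inside the other crop, and that the differences are squared or absolute errors. All function names are hypothetical.

```python
def cell_centers(box, grid):
    """Centers of the grid x grid cells of a crop, in input-image coordinates."""
    x, y, w, h = box
    cw, ch = w / grid, h / grid
    return [[(x + (j + 0.5) * cw, y + (i + 0.5) * ch) for j in range(grid)]
            for i in range(grid)]


def contains(box, point):
    """True if the point lies inside the half-open rectangle of the crop."""
    x, y, w, h = box
    px, py = point
    return x <= px < x + w and y <= py < y + h


def mask_label(box_a, box_b, grid):
    """Claim 7 style label: 1 where a cell of crop A falls inside crop B."""
    return [[1 if contains(box_b, c) else 0 for c in row]
            for row in cell_centers(box_a, grid)]


def offset_label(box_a, box_b, grid):
    """Claim 5 style label: (x, y) offset between same-index cell centers."""
    ca, cb = cell_centers(box_a, grid), cell_centers(box_b, grid)
    return [[(cb[i][j][0] - ca[i][j][0], cb[i][j][1] - ca[i][j][1])
             for j in range(grid)] for i in range(grid)]


def overlap_feature_loss(feats_a, feats_b, box_a, box_b, grid):
    """Claim 2 style loss: mean squared difference over overlapping cells."""
    total, count = 0.0, 0
    centers = cell_centers(box_a, grid)
    for i in range(grid):
        for j in range(grid):
            cx, cy = centers[i][j]
            # Map the center of the A cell into fractional grid coordinates of B.
            u = (cx - box_b[0]) / box_b[2] * grid
            v = (cy - box_b[1]) / box_b[3] * grid
            if 0 <= u < grid and 0 <= v < grid:
                fa, fb = feats_a[i][j], feats_b[int(v)][int(u)]
                total += sum((p - q) ** 2 for p, q in zip(fa, fb))
                count += 1
    return total / count if count else 0.0


def mean_abs_difference(predicted, label):
    """Claim 9 style loss: mean absolute difference between estimate and label."""
    pairs = list(zip(predicted, label))
    return sum(abs(p - q) for p, q in pairs) / len(pairs)
```

For two 4x4 crops at `(0, 0)` and `(2, 2)` with a 2x2 grid, only the bottom-right cell of the first crop falls inside the second, so `mask_label` returns `[[0, 0], [0, 1]]` and every same-index offset is `(2.0, 2.0)`.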
Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods · CPC title
using neural networks · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title