Evaluation method for learning models, training method, device, and program
US-2020005183-A1 · Jan 2, 2020 · US
US12530865B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12530865-B2 |
| Application number | US-202117920276-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 18, 2021 |
| Priority date | Apr 30, 2020 |
| Publication date | Jan 20, 2026 |
| Grant date | Jan 20, 2026 |
Official abstract text for this publication.
There is provided an information processing device to improve the accuracy of estimation using a student network, the information processing device including an estimation unit that estimates an object class of an object included in an input image using a student network generated based on a teacher network generated by machine learning using images stored in a large-scale image database as training data. The student network is generated by machine learning using, as training data, synthetic images obtained using the teacher network and real environment images acquired by a plurality of different modalities in a real environment in which estimation by the estimation unit is expected to be executed.
Opening claim text (preview).
The invention claimed is:

1. An information processing device comprising circuitry configured to estimate an object class of an object included in an input image using a student network generated based on a teacher network generated by machine learning using images stored in a large-scale image database as training data, wherein the student network is generated by machine learning using, as training data, synthetic images obtained using the teacher network and real environment images acquired by a plurality of different modalities in a real environment in which the estimation by the circuitry is executed, and the synthetic images are generated based on inputting the real environment images acquired by a corresponding modality to each of a plurality of the teacher networks corresponding to a single modality.
2. The information processing device according to claim 1, wherein the synthetic images are generated based on adding noise to a feature quantity obtained by inputting the real environment images to the teacher network.
3. The information processing device according to claim 2, wherein the synthetic images are generated based on adding noise in a principal component direction in a feature quantity distribution obtained by inputting the real environment images to the teacher network.
4. The information processing device according to claim 3, wherein the synthetic images are generated so that a difference between a feature quantity after average pooling obtained by inputting the real environment images to the teacher network and a feature quantity in which noise is added to the feature quantity in the principal component direction decreases.
5. The information processing device according to claim 1, wherein the student network is generated by machine learning using the synthetic images whose degree of similarity between different modalities exceeds a threshold value among the generated synthetic images.
6. The information processing device according to claim 1, wherein the student network is generated by machine learning using the synthetic images whose degree of similarity in a same modality exceeds a threshold value among the generated synthetic images.
7. The information processing device according to claim 1, wherein the student network is generated by machine learning using a fusion image obtained by fusing a plurality of the generated synthetic images.
8. The information processing device according to claim 7, wherein the fusion image is generated by fusing a plurality of the synthetic images whose degree of similarity exceeds a threshold value among the generated synthetic images.
9. The information processing device according to claim 7, wherein the fusion image is generated by fusing a plurality of the synthetic images related to a same object class among the generated synthetic images.
10. The information processing device according to claim 7, wherein the fusion image is generated by concatenating the synthetic images related to a plurality of different modalities.
11. The information processing device according to claim 10, wherein the fusion image is generated by concatenating a plurality of the generated synthetic images whose degree of similarity between different modalities exceeds a threshold value in a channel direction.
12. The information processing device according to claim 1, wherein the synthetic images are generated based on a process of similarizing feature quantity distributions related to each modality obtained by inputting the real environment images to the teacher network.
13. The information processing device according to claim 12, wherein the synthetic images are generated using the teacher network generated by machine learning using an image obtained by concatenating real environment images related to a plurality of modalities acquired at a same timing and from a same direction in a channel direction as training data.
14. The information processing device according to claim 12, wherein the synthetic images are generated based on a process of decreasing a distance on a feature quantity space between a feature quantity obtained by inputting the real environment images related to a certain modality to the teacher network and a feature quantity obtained by inputting the real environment images related to another modality different from the certain modality to the teacher network.
15. The information processing device according to claim 12, wherein the synthetic images are generated based on a process of transforming a feature quantity obtained by inputting the real environment images related to a certain modality to the teacher network and a feature quantity obtained by inputting the real environment images related to another modality different from the certain modality to the teacher network.
16. The information processing device according to claim 1, wherein the circuitry is further configured to acquire images in the real environment, and the circuitry estimates an object class related to an object included in an acquired image.
17. The information processing device according to claim 16, wherein the circuitry acquires images by at least one modality among a plurality of modalities used for acquiring the real environment images used for generating the synthetic images.
18. An information processing device comprising circuitry configured to generate a student network based on a teacher network generated by machine learning using images stored in a large-scale image database as training data, wherein the circuitry generates the student network by machine learning using, as training data, synthetic images obtained using the teacher network and real environment images acquired by a plurality of different modalities in a real environment in which an estimation of an object class of an object included in an input image using the student network is to be executed, and the synthetic images are generated based on inputting the real environment images acquired by a corresponding modality to each of a plurality of the teacher networks corresponding to a single modality.
19. A non-transitory computer-readable medium having embodied thereon a program, which when executed by a computer causes the computer to perform an information processing method comprising estimating an object class of an object included in an input image using a student network generated based on a teacher network generated by machine learning using images stored in a large-scale image database as training data, wherein the student network is generated by machine learning using, as training data, synthetic images obtained using the teacher network and real environment images acquired by a plurality of different modalities in a real environment in which the estimation is executed, and the synthetic images are generated based on inputting the real environment images acquired by a corresponding modality to each of a plurality of the teacher networks corresponding to a single modality.
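
Illustrative sketch (not part of the patent text). The steps recited in claims 2 to 4 (noise added to a teacher feature along a principal component direction, then an image adjusted so its teacher feature approaches that noised feature) and in claims 5 and 11 (keeping cross-modality pairs above a similarity threshold and concatenating them in the channel direction) can be pictured with a small NumPy example. Everything named here is an assumption made for illustration: the function names are hypothetical, `teacher_w` is a linear map standing in for a trained teacher network with average pooling, cosine similarity stands in for the unspecified "degree of similarity", and plain gradient descent stands in for whatever optimization the patent actually uses.

```python
import numpy as np


def principal_direction(features):
    """Unit vector of the first principal component of row-wise features."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]


def add_noise_along_pc(feature, direction, scale, rng):
    """Shift one feature vector by Gaussian noise along `direction` (cf. claims 2-3)."""
    return feature + rng.normal(0.0, scale) * direction


def synthesize_image(teacher_w, target_feature, shape, steps=300, seed=0):
    """Adjust pixels so the teacher feature approaches the target (cf. claim 4).

    ASSUMPTION: `teacher_w` is a linear stand-in for a trained teacher network
    followed by average pooling; a real teacher would be nonlinear.
    """
    rng = np.random.default_rng(seed)
    image = rng.normal(0.0, 0.1, size=shape)
    # Step size 1/L, where L is the squared spectral norm of the linear map.
    lr = 1.0 / np.linalg.norm(teacher_w, 2) ** 2
    for _ in range(steps):
        residual = teacher_w @ image.ravel() - target_feature
        grad = teacher_w.T @ residual  # gradient of 0.5 * ||residual||^2
        image -= lr * grad.reshape(shape)
    return image


def cosine_similarity(a, b):
    """ASSUMPTION: cosine similarity; the claims do not name a measure."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def fuse_if_similar(img_a, img_b, threshold):
    """Concatenate two (H, W, C) images along channels if similar (cf. claims 5, 11).

    Returns None when the similarity does not exceed the threshold.
    """
    if cosine_similarity(img_a, img_b) <= threshold:
        return None
    return np.concatenate([img_a, img_b], axis=-1)


# Toy data: 50 four-dimensional "teacher features" with most variance on axis 0.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 4)) * np.array([5.0, 1.0, 1.0, 1.0])
direction = principal_direction(features)
target = add_noise_along_pc(features[0], direction, scale=0.5, rng=rng)

# Hypothetical linear teacher mapping an 8x8 image to a 4-dimensional feature.
teacher_w = rng.normal(size=(4, 64)) / 8.0
synthetic = synthesize_image(teacher_w, target, shape=(8, 8))
print("feature mismatch:", np.linalg.norm(teacher_w @ synthetic.ravel() - target))
```

In this toy setting one synthetic image would be produced per modality, each from its own teacher, and `fuse_if_similar` would then decide which cross-modality pairs are stacked into a multi-channel training image for the student network.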
Image fusion; Image merging · CPC title
Artificial neural networks [ANN] · CPC title
Training; Learning · CPC title
using two or more images, e.g. averaging or subtraction · CPC title
Proximity, similarity or dissimilarity measures · CPC title