Information processing device and program

US12530865B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12530865-B2
Application number  US-202117920276-A
Country             US
Kind code           B2
Filing date         Mar 18, 2021
Priority date       Apr 30, 2020
Publication date    Jan 20, 2026
Grant date          Jan 20, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

There is provided an information processing device to improve the accuracy of estimation using a student network, the information processing device including an estimation unit that estimates an object class of an object included in an input image using a student network generated based on a teacher network generated by machine learning using images stored in a large-scale image database as training data. The student network is generated by machine learning using, as training data, synthetic images obtained using the teacher network and real environment images acquired by a plurality of different modalities in a real environment in which estimation by the estimation unit is expected to be executed.
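
The teacher/student relationship in the abstract can be illustrated with a toy example. The sketch below is a minimal illustration under stated assumptions, not the patented method: it assumes a distillation-style setup in which the student is fitted to the teacher's soft outputs, which the abstract does not spell out. The "teacher" is a hypothetical fixed linear classifier, the "synthetic images" are random vectors, and all names (`teacher_w`, `student_w`, `synthetic`) are invented for illustration.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_dims, n_classes = 8, 3

# ASSUMPTION: a fixed linear classifier stands in for the large teacher network.
teacher_w = rng.normal(size=(n_dims, n_classes))

# ASSUMPTION: random vectors stand in for the synthetic training images.
synthetic = rng.normal(size=(500, n_dims))
teacher_probs = softmax(synthetic @ teacher_w)

# Student: fitted by gradient descent on cross-entropy against the
# teacher's soft labels (a distillation-style objective, assumed here).
student_w = np.zeros((n_dims, n_classes))
for _ in range(300):
    student_probs = softmax(synthetic @ student_w)
    grad = synthetic.T @ (student_probs - teacher_probs) / len(synthetic)
    student_w -= 0.5 * grad

# Fraction of unseen inputs on which student and teacher pick the same class.
held_out = rng.normal(size=(200, n_dims))
agreement = np.mean(
    (held_out @ student_w).argmax(axis=1) == (held_out @ teacher_w).argmax(axis=1)
)
print(f"student/teacher agreement on held-out inputs: {agreement:.2f}")
```

In a real system both networks would be deep image models and the training inputs would be the synthetic images described in the abstract; the toy only shows the direction of knowledge transfer.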

First claim

Opening claim text (preview).

The invention claimed is:

1. An information processing device comprising circuitry configured to estimate an object class of an object included in an input image using a student network generated based on a teacher network generated by machine learning using images stored in a large-scale image database as training data, wherein the student network is generated by machine learning using, as training data, synthetic images obtained using the teacher network and real environment images acquired by a plurality of different modalities in a real environment in which the estimation by the circuitry is executed, and the synthetic images are generated based on inputting the real environment images acquired by a corresponding modality to each of a plurality of the teacher networks corresponding to a single modality.

2. The information processing device according to claim 1, wherein the synthetic images are generated based on adding noise to a feature quantity obtained by inputting the real environment images to the teacher network.

3. The information processing device according to claim 2, wherein the synthetic images are generated based on adding noise in a principal component direction in a feature quantity distribution obtained by inputting the real environment images to the teacher network.

4. The information processing device according to claim 3, wherein the synthetic images are generated so that a difference between a feature quantity after average pooling obtained by inputting the real environment images to the teacher network and a feature quantity in which noise is added to the feature quantity in the principal component direction decreases.

5. The information processing device according to claim 1, wherein the student network is generated by machine learning using the synthetic images whose degree of similarity between different modalities exceeds a threshold value among the generated synthetic images.

6. The information processing device according to claim 1, wherein the student network is generated by machine learning using the synthetic images whose degree of similarity in a same modality exceeds a threshold value among the generated synthetic images.

7. The information processing device according to claim 1, wherein the student network is generated by machine learning using a fusion image obtained by fusing a plurality of the generated synthetic images.

8. The information processing device according to claim 7, wherein the fusion image is generated by fusing a plurality of the synthetic images whose degree of similarity exceeds a threshold value among the generated synthetic images.

9. The information processing device according to claim 7, wherein the fusion image is generated by fusing a plurality of the synthetic images related to a same object class among the generated synthetic images.

10. The information processing device according to claim 7, wherein the fusion image is generated by concatenating the synthetic images related to a plurality of different modalities.

11. The information processing device according to claim 10, wherein the fusion image is generated by concatenating a plurality of the generated synthetic images whose degree of similarity between different modalities exceeds a threshold value in a channel direction.

12. The information processing device according to claim 1, wherein the synthetic images are generated based on a process of similarizing feature quantity distributions related to each modality obtained by inputting the real environment images to the teacher network.

13. The information processing device according to claim 12, wherein the synthetic images are generated using the teacher network generated by machine learning using an image obtained by concatenating real environment images related to a plurality of modalities acquired at a same timing and from a same direction in a channel direction as training data.

14. The information processing device according to claim 12, wherein the synthetic images are generated based on a process of decreasing a distance on a feature quantity space between a feature quantity obtained by inputting the real environment images related to a certain modality to the teacher network and a feature quantity obtained by inputting the real environment images related to another modality different from the certain modality to the teacher network.

15. The information processing device according to claim 12, wherein the synthetic images are generated based on a process of transforming a feature quantity obtained by inputting the real environment images related to a certain modality to the teacher network and a feature quantity obtained by inputting the real environment images related to another modality different from the certain modality to the teacher network.

16. The information processing device according to claim 1, wherein the circuitry is further configured to acquire images in the real environment, and the circuitry estimates an object class related to an object included in an acquired image.

17. The information processing device according to claim 16, wherein the circuitry acquires images by at least one modality among a plurality of modalities used for acquiring the real environment images used for generating the synthetic images.

18. An information processing device comprising circuitry configured to generate a student network based on a teacher network generated by machine learning using images stored in a large-scale image database as training data, wherein the circuitry generates the student network by machine learning using, as training data, synthetic images obtained using the teacher network and real environment images acquired by a plurality of different modalities in a real environment in which an estimation of an object class of an object included in an input image using the student network is to be executed, and the synthetic images are generated based on inputting the real environment images acquired by a corresponding modality to each of a plurality of the teacher networks corresponding to a single modality.

19. A non-transitory computer-readable medium having embodied thereon a program, which when executed by a computer causes the computer to perform an information processing method comprising estimating an object class of an object included in an input image using a student network generated based on a teacher network generated by machine learning using images stored in a large-scale image database as training data, wherein the student network is generated by machine learning using, as training data, synthetic images obtained using the teacher network and real environment images acquired by a plurality of different modalities in a real environment in which the estimation is executed, and the synthetic images are generated based on inputting the real environment images acquired by a corresponding modality to each of a plurality of the teacher networks corresponding to a single modality.
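
Claims 2 and 3 describe adding noise to a feature quantity along a principal component direction of the feature distribution. The sketch below shows one plausible reading of that step at the feature level only. It is an assumption-laden illustration: the function name `add_principal_component_noise`, the `scale` parameter, the use of only the first principal component, and the Gaussian noise model are all hypothetical choices not stated in the claims, and random vectors stand in for teacher-network features. It does not cover the image-generation step of claim 4.

```python
import numpy as np

def add_principal_component_noise(features, scale=0.1, rng=None):
    """Perturb each feature vector along the first principal direction.

    features: array of shape (n_samples, n_dims), a stand-in for feature
    quantities produced by a teacher network (assumption).
    Returns the perturbed features and the unit direction used.
    """
    rng = np.random.default_rng() if rng is None else rng
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    # Standard deviation of the data along the first principal direction.
    std_along = singular_values[0] / np.sqrt(max(len(features) - 1, 1))
    # One Gaussian coefficient per sample, scaled relative to that spread.
    coeffs = rng.normal(0.0, scale * std_along, size=(len(features), 1))
    return features + coeffs * direction, direction

rng = np.random.default_rng(0)
# Toy features: wide spread along axis 0, narrow spread along axis 1.
features = rng.normal(size=(200, 2)) * np.array([5.0, 0.5])
noisy, direction = add_principal_component_noise(features, scale=0.1, rng=rng)
delta = noisy - features  # every row is a multiple of `direction`
```

Because the noise follows the dominant direction of variation, perturbed features stay close to the observed distribution rather than drifting off in arbitrary directions, which is one possible motivation for the restriction in claim 3.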

Assignees

Sony Group Corp

Inventors

Classifications

  • Image fusion; Image merging · CPC title

  • Artificial neural networks [ANN] · CPC title

  • Training; Learning · CPC title

  • using two or more images, e.g. averaging or subtraction · CPC title

  • Proximity, similarity or dissimilarity measures · CPC title
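
The fusion and similarity classifications correspond to claims 5, 10 and 11, which select synthetic images whose cross-modality similarity exceeds a threshold and concatenate them in the channel direction. The following is a minimal sketch of that selection-and-stacking step. The modality names (`rgb`, `depth`), the cosine similarity on pixel values, the `threshold` default, and the function names are all assumptions made for illustration; the claims do not name a similarity measure or specific modalities.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two arrays, compared as flat vectors."""
    a = a.ravel().astype(float)
    b = b.ravel().astype(float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def fuse_similar_pairs(rgb_images, depth_images, threshold=0.8):
    """Keep pairs whose similarity exceeds `threshold`, stacked along channels.

    rgb_images: list of (H, W, 3) arrays; depth_images: list of (H, W, 1)
    arrays. Both modalities are hypothetical stand-ins.
    """
    fused = []
    for rgb, depth in zip(rgb_images, depth_images):
        gray = rgb.mean(axis=-1)  # collapse colour so shapes match
        if cosine_similarity(gray, depth[..., 0]) > threshold:
            # Channel-direction concatenation: (H, W, 3) + (H, W, 1) -> (H, W, 4)
            fused.append(np.concatenate([rgb, depth], axis=-1))
    return fused

base = np.arange(16, dtype=float).reshape(4, 4) + 1.0
rgb_a = np.stack([base, base, base], axis=-1)
depth_a = (2.0 * base)[..., None]        # same pattern as rgb_a: kept
depth_b = base[::-1, ::-1][..., None]    # reversed pattern: filtered out
fused = fuse_similar_pairs([rgb_a, rgb_a], [depth_a, depth_b])
print(len(fused), fused[0].shape)
```

Only the first pair passes the threshold, so the result holds a single four-channel fusion image. A practical system would more likely compare learned features than raw pixels; pixel-level cosine similarity is used here only to keep the example self-contained.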

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12530865B2 cover?
There is provided an information processing device to improve the accuracy of estimation using a student network, the information processing device including an estimation unit that estimates an object class of an object included in an input image using a student network generated based on a teacher network generated by machine learning using images stored in a large-scale image database as tra…
Who is the assignee on this patent?
Sony Group Corp
What technology area does this patent fall under?
Primary CPC classification G06V10/764. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 20, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).