Information processing apparatus, information processing method, and program

US2023245423A1 · US · A1

Patent metadata
Publication number: US-2023245423-A1
Application number: US-202118002690-A
Country: US
Kind code: A1
Filing date: Jun 18, 2021
Priority date: Jul 2, 2020
Publication date: Aug 3, 2023
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present technique relates to an information processing apparatus, an information processing method, and a program that enable recognition accuracy to be improved while suppressing an increase in load in object recognition using a CNN. An information processing apparatus: performs, a plurality of times, convolution of an image feature map representing a feature amount of an image of a first frame and generates a convolutional feature map of a plurality of layers; performs deconvolution of a feature map based on the convolutional feature map based on an image of a second frame preceding the first frame and generates a deconvolutional feature map; and performs object recognition based on the convolutional feature map based on an image of the first frame and on the deconvolutional feature map based on an image of the second frame. The present technique can be applied to, for example, a system which performs object recognition.
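The data flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the single-channel nested-list feature maps, the 2x2 stride-2 kernels, the reuse of one kernel for every layer, the element-wise addition used to combine maps, and all function names (`conv_pyramid`, `fuse_with_previous_frame`, and so on) are assumptions made for the example. The object detector that would consume the combined maps is left out.

```python
# Sketch only. Kernel shapes, the shared kernel across layers, the additive
# combination, and every name below are illustrative assumptions.

AVG = [[0.25, 0.25], [0.25, 0.25]]   # hypothetical convolution kernel
ONES = [[1.0, 1.0], [1.0, 1.0]]      # hypothetical deconvolution kernel


def conv2x2_stride2(fmap, kernel):
    """Stride-2 convolution with a 2x2 kernel: halves height and width."""
    h, w = len(fmap), len(fmap[0])
    return [[sum(fmap[i + di][j + dj] * kernel[di][dj]
                 for di in range(2) for dj in range(2))
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]


def deconv2x2_stride2(fmap, kernel):
    """Stride-2 transposed convolution with a 2x2 kernel: doubles height and width."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            for di in range(2):
                for dj in range(2):
                    out[2 * i + di][2 * j + dj] += fmap[i][j] * kernel[di][dj]
    return out


def conv_pyramid(image_fmap, kernel, num_layers):
    """Convolve repeatedly and keep the feature map produced at every layer."""
    layers, current = [], image_fmap
    for _ in range(num_layers):
        current = conv2x2_stride2(current, kernel)
        layers.append(current)
    return layers


def fuse_with_previous_frame(curr_layers, prev_layers, deconv_kernel):
    """Combine each current-frame layer with the previous frame's next-deeper
    layer after one deconvolution, so both maps have the same size.
    The deepest layer has no deeper partner and is passed through unchanged
    (a choice made for this sketch)."""
    fused = []
    for k in range(len(curr_layers) - 1):
        up = deconv2x2_stride2(prev_layers[k + 1], deconv_kernel)
        fused.append([[a + b for a, b in zip(row_c, row_p)]
                      for row_c, row_p in zip(curr_layers[k], up)])
    fused.append(curr_layers[-1])
    return fused
```

In a video loop, `prev_layers` would simply be the `curr_layers` kept from the preceding frame, so the deeper-layer context comes from work that has already been done rather than from extra convolution on the current frame.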

First claim

Opening claim text (preview).

1. An information processing apparatus, comprising: a convoluting portion configured to perform, a plurality of times, convolution of an image feature map representing a feature amount of an image and to generate a convolutional feature map of a plurality of layers; a deconvoluting portion configured to perform deconvolution of a feature map based on the convolutional feature map and to generate a deconvolutional feature map; and a recognizing portion configured to perform object recognition based on the convolutional feature map and the deconvolutional feature map, wherein the convoluting portion is configured to perform, a plurality of times, convolution of the image feature map representing a feature amount of an image of a first frame and to generate the convolutional feature map of a plurality of layers; the deconvoluting portion is configured to perform deconvolution of a feature map based on the convolutional feature map based on an image of a second frame preceding the first frame and to generate the deconvolutional feature map, and the recognizing portion is configured to perform object recognition based on the convolutional feature map based on an image of the first frame and on the deconvolutional feature map based on an image of the second frame.

2. The information processing apparatus according to claim 1, wherein the recognizing portion is configured to perform object recognition by combining a first convolutional feature map based on an image of the first frame and a first deconvolutional feature map which is based on an image of the second frame and of which a layer is the same as the first convolutional feature map.

3. The information processing apparatus according to claim 2, wherein the deconvoluting portion is configured to generate, based on an image of the second frame, the first deconvolutional feature map by performing deconvolution of a feature map based on a second convolutional feature map which is deeper by n-number (n ≥ 1) of layers than the first convolutional feature map n-number of times.

4. The information processing apparatus according to claim 3, wherein the deconvoluting portion is configured to further generate, based on an image of the second frame, a second deconvolutional feature map by performing deconvolution of a feature map based on a third convolutional feature map which is deeper by m-number (m ≥ 1, m ≠ n) of layers than the first convolutional feature map m-number of times, and the recognizing portion is configured to perform object recognition by further combining the second deconvolutional feature map.

5. The information processing apparatus according to claim 3, wherein the second frame is a frame immediately preceding the first frame, n = 1 is satisfied, the deconvoluting portion is configured to further generate a third deconvolutional feature map by performing deconvolution, once, of a second deconvolutional feature map which is one layer deeper than the first convolutional feature map and which is used in object recognition of an image of the second frame, and the recognizing portion is configured to perform object recognition by further combining the third deconvolutional feature map.

6. The information processing apparatus according to claim 2, wherein the recognizing portion is configured to perform object recognition based on a synthesized feature map obtained by synthesizing the first convolutional feature map and the first deconvolutional feature map.

7. The information processing apparatus according to claim 6, wherein the deconvoluting portion is configured to generate the first deconvolutional feature map by performing deconvolution of the synthesized feature map which is used in object recognition of an image of the second frame and which is one layer deeper than the first deconvolutional feature map.

8. The information processing apparatus according to claim 1, wherein the convoluting portion and the deconvoluting portion are configured to perform processing in parallel.

9. The information processing apparatus according to claim 1, wherein the recognizing portion is configured to perform object recognition further based on the image feature map.

10. The information processing apparatus according to claim 1, further comprising a feature amount extracting portion configured to generate the image feature map.

11. The information processing apparatus according to claim 1, further comprising: a first feature amount extracting portion configured to extract a feature amount of a photographed image obtained by a camera and to generate a first image feature map; a second feature amount extracting portion configured to extract a feature amount of a sensor image representing a sensing result of a sensor of which a sensing range at least partially overlaps with a photographing range of the camera and to generate a second image feature map; and a synthesizing portion configured to generate a synthesized image feature map being the image feature map obtained by synthesizing the first image feature map and the second image feature map, wherein the convoluting portion is configured to perform convolution of the synthesized image feature map.

12. The information processing apparatus according to claim 11, further comprising: a geometric transformation portion configured to transform a first sensor image representing the sensing result according to a first coordinate system into a second sensor image representing the sensing result according to a second coordinate system, wherein the second feature amount extracting portion is configured to extract a feature amount of the second sensor image and to generate the second image feature map.

13. The information processing apparatus according to claim 11, wherein the sensor is a milliwave radar or LiDAR (Light Detection and Ranging).

14. The information processing apparatus according to claim 1, further comprising: a first feature amount extracting portion configured to extract a feature amount of a photographed image obtained by a camera and to generate a first image feature map; a second feature amount extracting portion configured to extract a feature amount of a sensor image representing a sensing result of a sensor of which a sensing range at least partially overlaps with a photographing range of the camera and to generate a second image feature map; a first recognizing portion which includes the convoluting portion, the deconvoluting portion, and the recognizing portion and which is configured to perform object recognition based on the first image feature map; a second recognizing portion which includes the convoluting portion, the deconvoluting portion, and the recognizing portion and which is configured to perform object recognition based on the second image feature map; and an integrating portion configured to integrate a recognition result of an object by the first recognizing portion and a recognition result of an object by the second recognizing portion.

15. The information processing apparatus according to claim 14, wherein the sensor is a milliwave radar or LiDAR (Light Detection and Ranging).

16. The information processing apparatus according to claim 1, wherein a feature map based on the convolutional feature map is the convolutional feature map itself.

17. The information processing apparatus according to claim 1, wherein the first frame and the second frame are adjacent frames.

18. An information processing method, comprising the steps of: pe
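Claims 3 and 4 come down to an indexing rule: a previous-frame map taken d layers deeper than the target layer is deconvolved d times so that it lands back at the target layer's size, and maps from two different depths (n and m) can be combined together. The sketch below illustrates that rule only. It assumes single-channel nested-list maps, 2x2 averaging as the convolution, a stride-2 transposed convolution with an all-ones 2x2 kernel as the deconvolution, and element-wise addition as the combination; none of these choices, nor the function names, come from the claim text.

```python
# Illustrative sketch of the layer-offset rule in claims 3 and 4.
# All operators, names, and the additive combination are assumptions.

def conv(fmap):
    """One stride-2 convolution step (2x2 averaging kernel): halves each side."""
    return [[(fmap[i][j] + fmap[i][j + 1]
              + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]


def deconv(fmap):
    """One stride-2 transposed convolution step with an all-ones 2x2 kernel:
    each input value is copied into a 2x2 output block, doubling each side."""
    return [[fmap[i // 2][j // 2] for j in range(2 * len(fmap[0]))]
            for i in range(2 * len(fmap))]


def deconv_times(fmap, times):
    """Apply the deconvolution step `times` times in a row."""
    for _ in range(times):
        fmap = deconv(fmap)
    return fmap


def layers_of(image_fmap, num_layers):
    """Convolutional feature maps of all layers for one frame."""
    layers, current = [], image_fmap
    for _ in range(num_layers):
        current = conv(current)
        layers.append(current)
    return layers


def combine_layer(curr_layers, prev_layers, k, offsets):
    """Combine current-frame layer k with previous-frame maps that are
    d layers deeper and deconvolved d times, for each d in `offsets`
    (for example offsets=(n, m) with n != m)."""
    combined = [row[:] for row in curr_layers[k]]
    for d in offsets:
        if k + d >= len(prev_layers):
            continue  # no previous-frame layer that deep; skip it in this sketch
        up = deconv_times(prev_layers[k + d], d)
        combined = [[a + b for a, b in zip(row_c, row_u)]
                    for row_c, row_u in zip(combined, up)]
    return combined
```

With `offsets=(1,)` this reduces to the n = 1 case, where only the next-deeper previous-frame layer is used.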

Assignees

Sony Semiconductor Solutions Corp

Inventors

Classifications

  • G06V10/82 (Primary)

    using neural networks · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • the classifiers operating on different input data, e.g. multi-modal recognition · CPC title

  • G06V10/764 (Primary)

    using classification, e.g. of video objects · CPC title

  • by mapping characteristic values of the pattern into a parameter space, e.g. Hough transformation · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023245423A1 cover?
The present technique relates to an information processing apparatus, an information processing method, and a program that enable recognition accuracy to be improved while suppressing an increase in load in object recognition using a CNN. An information processing apparatus: performs, a plurality of times, convolution of an image feature map representing a feature amount of an image of a …
Who is the assignee on this patent?
Sony Semiconductor Solutions Corp
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Published Aug 3, 2023 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).