Information processing apparatus, information processing method, and non-transitory computer readable medium

US12361682B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12361682-B2
Application number: US-202118008730-A
Country: US
Kind code: B2
Filing date: Dec 15, 2021
Priority date: Dec 15, 2021
Publication date: Jul 15, 2025
Grant date: Jul 15, 2025

Abstract

Official abstract text for this publication.

An information processing apparatus includes a processor configured to read program code stored in memory and operate as instructed by the program code. The program code includes acquisition code configured to cause the at least one processor to acquire a red, blue, green (RGB) image including an object. The program code includes converting code configured to cause the at least one processor to apply a discrete cosine transform (DCT) to the RGB image to generate image coefficients corresponding to an YCbCr image comprising Luma (Y) elements and Chroma (Cb, Cr) elements. The program code includes prediction code configured to cause the at least one processor to predict various attributes relating to the object by inputting the image coefficient into a learning model. The learning model is a learning model that is stored in the at least one memory and shared between a plurality of different objects including the object.
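The conversion step in the abstract can be illustrated with a short sketch. The following is not the patented implementation: it assumes a JPEG-style pipeline (full-range BT.601 color conversion, 8x8 blocks, 2x chroma subsampling) and uses nearest-neighbour repetition for the size matching of the Y, Cb, and Cr coefficients. The names `dct_matrix`, `rgb_to_ycbcr`, `block_dct`, and `image_coefficients` are hypothetical, and the image height and width are assumed to be multiples of 16.

```python
import numpy as np

BLOCK = 8  # assumed JPEG-style block size; the patent text does not fix one


def dct_matrix(n: int = BLOCK) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    x = np.arange(n).reshape(1, -1)  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)     # DC row uses a different scale factor
    return mat


def rgb_to_ycbcr(rgb: np.ndarray) -> np.ndarray:
    """Full-range BT.601 conversion (assumption); rgb is HxWx3 in [0, 255]."""
    m = np.array([[0.299, 0.587, 0.114],
                  [-0.168736, -0.331264, 0.5],
                  [0.5, -0.418688, -0.081312]])
    ycc = rgb.astype(np.float64) @ m.T
    ycc[..., 1:] += 128.0            # center the chroma channels
    return ycc


def block_dct(plane: np.ndarray) -> np.ndarray:
    """Blockwise 2-D DCT of an HxW plane; returns (H/8, W/8, 64) coefficients."""
    h, w = plane.shape
    d = dct_matrix()
    blocks = plane.reshape(h // BLOCK, BLOCK, w // BLOCK, BLOCK).transpose(0, 2, 1, 3)
    coeffs = d @ blocks @ d.T        # matmul broadcasts over the block grid
    return coeffs.reshape(h // BLOCK, w // BLOCK, BLOCK * BLOCK)


def image_coefficients(rgb: np.ndarray) -> np.ndarray:
    """RGB image -> concatenated Y/Cb/Cr DCT coefficients of shape (H/8, W/8, 192)."""
    ycc = rgb_to_ycbcr(rgb) - 128.0            # level shift all channels
    y = block_dct(ycc[..., 0])                 # (H/8,  W/8,  64)
    cb = block_dct(ycc[::2, ::2, 1])           # (H/16, W/16, 64) after subsampling
    cr = block_dct(ycc[::2, ::2, 2])
    # Size matching (assumed scheme): repeat each chroma cell 2x in both directions.
    cb = cb.repeat(2, axis=0).repeat(2, axis=1)
    cr = cr.repeat(2, axis=0).repeat(2, axis=1)
    return np.concatenate([y, cb, cr], axis=-1)


if __name__ == "__main__":
    image = np.random.default_rng(0).integers(0, 256, size=(32, 32, 3))
    print(image_coefficients(image).shape)  # (4, 4, 192)
```

Feeding the resulting tensor to a model, rather than decoded pixels, is what the abstract refers to as inputting the image coefficients into the learning model.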

First claim

Opening claim text (preview).

The invention claimed is:

1. An information processing apparatus comprising: at least one memory configured to store program code; and at least one processor configured to read the program code and operate as instructed by the program code, the program code including: acquisition code configured to cause the at least one processor to acquire a red, blue, green (RGB) image including an object; converting code configured to cause the at least one processor to apply a discrete cosine transform (DCT) to the RGB image to generate image coefficients corresponding to an YCbCr image comprising Luma (Y) elements and Chroma (Cb, Cr) elements; and prediction code configured to cause the at least one processor to predict various attributes relating to the object by inputting the image coefficient into a learning model, wherein the learning model is a learning model that is stored in the at least one memory and shared between a plurality of different objects including the object, and wherein the learning model includes: a plurality of estimation layers that estimate a plurality of attribute values for a plurality of attributes relating to the plurality of different objects, and an output layer that concatenates and outputs the plurality of attribute values outputted from the plurality of estimation layers.

2. The information processing apparatus according to claim 1, wherein the learning model is composed of a first part and a second part, the first part receives the image coefficients as an input and outputs a feature vector expressing features of the object, the second part includes the plurality of estimation layers and the output layer, the plurality of estimation layers receive the feature vector as an input and output a value indicating an object type of the object and the plurality of attribute values, and the output layer concatenates and outputs the value indicating the object type of the object and the plurality of attribute values outputted from the plurality of estimation layers.

3. The information processing apparatus according to claim 2, wherein the prediction code is further configured to cause the at least one processor to predict the various attributes from the plurality of attribute values outputted from the second part of the learning model.

4. The information processing apparatus according to claim 2, wherein at least one valid attribute value out of the plurality of attribute values is set in advance in keeping with the value indicated by the object type, and wherein the prediction code is further configured to cause the at least one processor to acquire, from the plurality of attribute values, the at least one valid attribute value in keeping with a value indicated by the object type, and predict attributes corresponding to the at least one valid attribute value as the various attributes.

5. The information processing apparatus according to claim 1, wherein in a case where the RGB object image includes a plurality of objects, the prediction code is further configured to cause the at least one processor to predict various attributes that relate to each of the plurality of objects.

6. The information processing apparatus according to claim 1, wherein the image coefficients are concatenated data produced by size matching of the Y elements, the Cb elements, and the Cr elements out of the data produced by the discrete cosine transform.

7. The information processing apparatus according to claim 1, wherein the program code further comprises output configured to cause the at least one processor to output the various attributes.

8. An information processing method performed by at least one processor, the method comprising: acquiring a red, blue, green (RGB) object image including an object; applying a discrete cosine transform (DCT) to the RGB image to generate image coefficients corresponding to an YCbCr image comprising Luma (Y) elements and Chroma (Cb, Cr) elements; and predicting various attributes relating to the object by inputting the image coefficients into a learning model, wherein the learning model is a learning model that is stored in a memory and shared between a plurality of different objects including the object, and wherein the learning model includes: a plurality of estimation layers that estimate a plurality of attribute values for a plurality of attributes relating to the plurality of different objects; and an output layer that concatenates and outputs the plurality of attribute values outputted from the plurality of estimation layers.

9. A non-transitory computer readable medium having instructions stored therein, which when executed by a processor, cause the processor to execute a method comprising: acquiring a red, blue, green (RGB) an object image including an object; applying a discrete cosine transform (DCT) to the RGB image to generate image coefficients corresponding to an YCbCr image comprising Luma (Y) elements and Chroma (Cb, Cr) elements; and predicting various attributes relating to the object by inputting a learning model to the object image, wherein the learning model is a learning model that is stored in memory and shared between a plurality of different objects including the object and includes: a plurality of estimation layers that estimate a plurality of attribute values for a plurality of attributes relating to the plurality of different objects; and an output layer that concatenates and outputs the plurality of attribute values outputted from the plurality of estimation layers.
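The model structure recited in claims 1, 2, and 4 (a shared first part producing a feature vector, several estimation layers, a concatenating output layer, and selection of the attribute values that are valid for the predicted object type) can be sketched as follows. This is an illustrative toy with random, untrained weights, not the claimed model: the head names in `HEADS`, the lookup table `VALID`, the class `SharedModel`, and the pooled linear layer standing in for the first part are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical estimation layers: name -> number of classes per head.
HEADS = {"object_type": 3, "color": 4, "sleeve_length": 2, "heel_height": 2}

# Hypothetical table set in advance: object type -> attributes valid for it.
VALID = {0: ["color", "sleeve_length"], 1: ["color", "heel_height"], 2: ["color"]}


def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()


class SharedModel:
    """One model shared across object types (weights are random placeholders)."""

    def __init__(self, in_channels: int, feat_dim: int = 16):
        self.w_feat = rng.normal(size=(in_channels, feat_dim)) * 0.1
        self.heads = {name: rng.normal(size=(feat_dim, n)) * 0.1
                      for name, n in HEADS.items()}

    def first_part(self, coeffs: np.ndarray) -> np.ndarray:
        # Stand-in feature extractor: global average pool, linear layer, ReLU.
        pooled = coeffs.mean(axis=(0, 1))
        return np.maximum(pooled @ self.w_feat, 0.0)

    def second_part(self, feat: np.ndarray) -> np.ndarray:
        # Each estimation layer scores one attribute; the output layer concatenates.
        return np.concatenate([softmax(feat @ w) for w in self.heads.values()])

    def __call__(self, coeffs: np.ndarray) -> np.ndarray:
        return self.second_part(self.first_part(coeffs))


def predict_attributes(output: np.ndarray):
    """Split the concatenated output and keep only attributes valid for the type."""
    parts, start = {}, 0
    for name, n in HEADS.items():
        parts[name] = output[start:start + n]
        start += n
    obj_type = int(parts["object_type"].argmax())
    return obj_type, {name: int(parts[name].argmax()) for name in VALID[obj_type]}


if __name__ == "__main__":
    model = SharedModel(in_channels=192)
    coeffs = rng.normal(size=(4, 4, 192))  # placeholder for DCT image coefficients
    print(predict_attributes(model(coeffs)))
```

Because every head is always evaluated, the concatenated output has a fixed length regardless of object type; the lookup table then discards the heads that do not apply, which mirrors the "valid attribute value" language of claim 4.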

Assignees

Rakuten Group Inc

Inventors

Classifications

  • using classification, e.g. of video objects · CPC title

  • G06V20/60 · Primary

    Type of objects · CPC title

  • Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components · CPC title

  • Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12361682B2 cover?
An information processing apparatus includes a processor configured to read program code stored in memory and operate as instructed by the program code. The program code includes acquisition code configured to cause the at least one processor to acquire a red, blue, green (RGB) image including an object. The program code includes converting code configured to cause the at least one processor to…
Who is the assignee on this patent?
Rakuten Group Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/60. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 15, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).