Who is the assignee on this patent?

Sony Interactive Entertainment Inc

What technology area does this patent fall under?

Primary CPC classification G06V20/70. Mapped technology areas include Physics.

When was this patent published?

Publication date Jan 20, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Qualifying labels automatically attributed to content in images

US12530913B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12530913-B2
Application number	US-202318171256-A
Country	US
Kind code	B2
Filing date	Feb 17, 2023
Priority date	Feb 17, 2023
Publication date	Jan 20, 2026
Grant date	Jan 20, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for image generation. The method including identifying a plurality of features of an image. The method including classifying each of the plurality of features using an artificial intelligence (AI) model trained to identify features in a plurality of images, wherein the plurality of features is classified as a plurality of labels, wherein the image is provided as input to the AI model. The method including receiving feedback for a label, wherein the feedback is associated with a user. The method including modifying a label based on the feedback. The method including updating the plurality of labels with the label that is modified. The method including providing as input the plurality of labels that is updated into an image generation artificial intelligence system configured for implementing latent diffusion to generate an updated image.
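The flow in the abstract (classify features into labels, modify a label from user feedback, pass the updated labels to a generator) can be sketched in a few lines of Python. This is an illustration only, not the patented implementation: the `Label` class, `apply_feedback`, and `labels_to_prompt` are hypothetical names, the classifier and the latent diffusion generator are left out entirely, and joining labels into a text prompt is an assumption about how labels might be handed to a generator.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Label:
    """One label attributed to an object in the image (hypothetical structure)."""
    obj: str        # object the label belongs to, e.g. "car"
    attribute: str  # what the label describes, e.g. "color"
    value: str      # current value, e.g. "red"


def apply_feedback(labels, obj, attribute, new_value):
    """Return a new label list in which the matching label is modified.

    Stands in for the "modifying a label based on the feedback" and
    "updating the plurality of labels" steps of the abstract.
    """
    return [
        replace(lb, value=new_value)
        if (lb.obj == obj and lb.attribute == attribute)
        else lb
        for lb in labels
    ]


def labels_to_prompt(labels):
    """Flatten labels into a text prompt (assumed generator input format)."""
    return ", ".join(f"{lb.value} {lb.obj}" for lb in labels)


# Labels as a classifier might emit them for an input image (made-up values).
labels = [Label("car", "color", "red"), Label("sky", "weather", "cloudy")]

# User feedback such as "make the car blue", already resolved to one label.
updated = apply_feedback(labels, obj="car", attribute="color", new_value="blue")

prompt = labels_to_prompt(updated)
print(prompt)  # blue car, cloudy sky
```

The original label list is left untouched, so the pre-feedback and post-feedback label sets can be compared before anything is regenerated.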

First claim

Opening claim text (preview).

What is claimed is:
1. A method, comprising: identifying a plurality of features of an image; classifying each of the plurality of features using an artificial intelligence (AI) model trained to classify features in a plurality of images, wherein the plurality of features is classified as a plurality of labels, and wherein the image is provided as input to the AI model; determining, based on a commentary, an object within the image to which the commentary applies; determining one or more labels of the object to which the commentary applies; translating the commentary into feedback for the one or more labels; modifying the one or more labels based on the feedback; updating the plurality of labels with the one or more labels that is modified; and providing, as input, the plurality of labels that is updated into an image generation artificial intelligence system configured for implementing latent diffusion to generate an updated image.
2. The method of claim 1, further comprising: generating the image using the image generation artificial intelligence system.
3. The method of claim 1, further comprising: receiving the feedback via a user interface, wherein the feedback is formatted in text.
4. The method of claim 1, further comprising: receiving the feedback as audio, wherein the feedback is presented in natural language; converting the audio to text; and presenting the text via a user interface.
5. The method of claim 1, further comprising: receiving identification of an object within a scene that is presented on a display; presenting one or more labels of the object in a user interface via the display, wherein the one or more labels of the object includes the label; and receiving identification of the label by a user via the user interface.
6. The method of claim 5, wherein the receiving identification of the object includes: determining that the user is pointing to a location in physical space corresponding to a location of the object within the scene in virtual space, wherein the scene is presented on the display of a head mounted display worn by the user; and determining that the user is pointing to the object within the scene based on the pointing.
7. The method of claim 5, wherein the receiving identification of the object includes: determining that the user selects the object in the scene using a controller.
8. The method of claim 1, further comprising: presenting a plurality of labels of a plurality of objects of a scene of the image in a user interface on a display, wherein the plurality of objects are presented in the user interface as a hierarchical file system of objects; receiving selection of an object via the hierarchical file system; presenting one or more labels of the object in the user interface via the display, wherein the one or more labels of the object includes the label; and receiving identification of the label by the user via the user interface.
9. The method of claim 1, further comprising: highlighting the object that is presented on a display.
10. The method of claim 1, further comprising: determining that a user is pointing to a location in physical space corresponding to a location of an object within a scene in virtual space, wherein the scene is presented on a display of a head mounted display worn by the user; highlighting the object in the scene; determining that the user is selecting the object based on the pointing; receiving commentary to modify the object from the user, wherein the commentary is presented in natural language; determining one or more labels of the object, wherein the one or more labels of the object includes the label; determining that the commentary applies to the label; and translating the commentary into the feedback for the label.
11. The method of claim 1, further comprising; determining a context based on the feedback for the label; and modifying one or more of the plurality of labels based on the context, wherein the plurality of labels that is updated includes one or more of the plurality of labels that have been modified.
12. The method of claim 1, further comprising: adding a new object corresponding to the label based on the feedback; determining a context based on the new object; and modifying one or more of the plurality of labels based on the context, wherein the plurality of labels that is updated includes one or more of the plurality of labels that have been modified.
13. The method of claim 1, further comprising: removing an object corresponding to the label based on the feedback; and removing the label from the plurality of labels when performing the modifying the label and when performing the updating the plurality of labels.
14. A non-transitory computer-readable medium having instructions stored thereon, which, when executed by a processor, cause the processor to perform a method comprising: identifying a plurality of features of an image; classifying each of the plurality of features using an artificial intelligence (AI) model trained to classify features in a plurality of images, wherein the plurality of features is classified as a plurality of labels, wherein the image is provided as input to the AI model; determining, based on a commentary, an object within the image to which the commentary applies; determining one or more labels of the object to which the commentary applies; translating the commentary into feedback for the one or more labels; modifying the one or more labels a label based on the feedback; updating the plurality of labels with the one or more labels that is modified; and providing as input the plurality of labels that is updated into an image generation artificial intelligence system configured for implementing latent diffusion to generate an updated image.
15. The non-transitory computer-readable medium of claim 14, further comprising instructions that, when executed, cause the processor to perform the method comprising: receiving identification of an object within a scene that is presented on a display; presenting one or more labels of the object in a user interface via the display, wherein the one or more labels of the object includes the label; and receiving identification of the label by a user via the user interface.
16. The non-transitory computer-readable medium of claim 14, further comprising instructions that, when executed, cause the processor to perform the method comprising: presenting a plurality of labels of a plurality of objects of a scene of the image in a user interface on a display, wherein the plurality of objects are presented in the user interface as a hierarchical file system of objects; receiving selection of an object via the hierarchical file system; presenting one or more labels of the object in the user interface via the display, wherein the one or more labels of the object includes the label; and receiving identification of the label by the user via the user interface.
17. The non-transitory computer-readable medium of claim 14, further comprising instructions that, when executed, cause the processor to perform the method comprising: highlighting the object that is presented on a display.
18. A computer system comprising: a processor; and memory coupled to the processor and having stored therein instructions that, if executed by the computer system, cause the computer system to execute a method for implementing a graphics pipeline, comprising: identifying a plurali
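
Claims 8 and 13 describe presenting the objects of a scene as a hierarchical file system and removing an object together with its labels. The following is a minimal sketch under stated assumptions, not the claimed implementation: the nested-dict scene layout, the slash-separated paths, the sample objects, and the helper names `list_paths` and `remove_object` are all hypothetical, chosen only to illustrate the idea.

```python
def list_paths(tree, prefix=""):
    """Yield file-system-style paths for every object in a nested scene dict."""
    for name, node in tree.items():
        path = f"{prefix}/{name}"
        yield path
        yield from list_paths(node.get("children", {}), path)


def remove_object(tree, labels, path):
    """Remove the object at `path`, its descendants, and all of their labels.

    Returns new structures; the inputs are not mutated.
    """
    parts = path.strip("/").split("/")

    def prune(node_map, remaining):
        out = {}
        for name, node in node_map.items():
            if name == remaining[0]:
                if len(remaining) == 1:
                    continue  # drop this object and everything under it
                children = prune(node.get("children", {}), remaining[1:])
                node = {**node, "children": children}
            out[name] = node
        return out

    new_tree = prune(tree, parts)
    new_labels = {
        p: ls for p, ls in labels.items()
        if p != path and not p.startswith(path + "/")
    }
    return new_tree, new_labels


# Hypothetical scene: objects nested like folders, labels keyed by object path.
scene = {
    "street": {
        "children": {
            "car": {"children": {"wheel": {}}},
            "tree": {},
        }
    }
}
labels = {
    "/street": ["urban", "daytime"],
    "/street/car": ["red", "sedan"],
    "/street/car/wheel": ["alloy"],
    "/street/tree": ["oak"],
}

# Feedback such as "remove the car" drops the object and its labels together.
scene2, labels2 = remove_object(scene, labels, "/street/car")
print(list(list_paths(scene2)))  # ['/street', '/street/tree']
print(sorted(labels2))           # ['/street', '/street/tree']
```

Keying labels by object path keeps the tree and the label set in step, so the remaining labels could be passed on to a generator without stale entries for the removed object.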

Assignees

Sony Interactive Entertainment Inc

Inventors

Green Arran

Classifications

G06T17/00: Three-dimensional [3D] modelling for computer graphics
G06F3/167: Audio in a user interface, e.g. using voice commands for navigating, audio feedback
G06F3/0482: Interaction with lists of selectable items, e.g. menus
G06V2201/07: Target detection
G06V10/764: using classification, e.g. of video objects

Patent family

Related publications grouped by family.

View patent family 90366851




Related patents

Automatically curated image searching

Characterization System and Method With Guided Defect Discovery

Identifying image comments from similar images

Method and computing device for generating image data set to be used for hazard detection and learning method and learning device using the same

Automatically curated image searching

Identifying image comments from similar images
