What technology area does this patent fall under?

Primary CPC classification G10L15/22. Mapped technology areas include Physics.

When was this patent published?

Publication date Oct 7, 2025 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Display apparatus and a voice control method

US12437758B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12437758-B2
Application number	US-202318169313-A
Country	US
Kind code	B2
Filing date	Feb 15, 2023
Priority date	Nov 13, 2020
Publication date	Oct 7, 2025
Grant date	Oct 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Some embodiments of the present application disclose a display apparatus and a voice control method for the display apparatus. The display apparatus comprises a display, a detector and a controller. The display is configured to present a user interface, and the detector is configured to acquire user voice information; and the controller is configured to cause the display apparatus to perform: acquiring voice information inputted from a user; in response to the voice information, extracting at least one keyword from the voice information; traversing action items in a configuration library; in response to determining that no action item in the configuration library matches the at least one keyword, obtaining text information of the user interface on the display in order to determine a control instruction according to the text information.
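The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Keyword`, `ActionItem`, `find_action_item`, `resolve`, and `read_screen_text` are hypothetical, matching is simplified to exact comparison, and the returned instruction string is an invented placeholder format.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Keyword:
    name: str    # controlled object named by the user (assumed field)
    action: str  # execution action requested by the user (assumed field)


@dataclass(frozen=True)
class ActionItem:
    controlled_object: str
    action: str
    instruction: str  # placeholder for whatever the device would execute


def find_action_item(library: List[ActionItem], keyword: Keyword) -> Optional[ActionItem]:
    """Traverse the configuration library and return the first matching item."""
    for item in library:
        # Assumption: exact equality; the claims also allow "similar" matches.
        if item.controlled_object == keyword.name and item.action == keyword.action:
            return item
    return None


def resolve(
    keyword: Keyword,
    library: List[ActionItem],
    read_screen_text: Callable[[], List[str]],
) -> Optional[str]:
    """Return an instruction from the library, else fall back to on-screen text."""
    item = find_action_item(library, keyword)
    if item is not None:
        return item.instruction
    # No library match: inspect the text currently shown on the user interface.
    for text in read_screen_text():
        if text.lower() == keyword.name.lower():
            return f"{keyword.action}:{text}"  # hypothetical instruction format
    return None
```

Here `read_screen_text` stands in for whatever supplies the on-screen text (for example OCR on a screenshot, as in claim 5); passing it as a callable means the fallback only runs when the library lookup fails.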

First claim

Opening claim text (preview).

What is claimed is:

1. A display apparatus, comprising: a display, configured to display an image from a broadcast system or a network, and/or a user interface; a detector, configured to acquire voice information from a user; and a controller, in connection with the display and the detector and configured to: display a user interface on the display; obtain the voice information input from the user while the user interface is displaying on the display; in response to the voice information, extract at least one keyword from the voice information, wherein the at least one keyword comprises a name content for indicating a controlled object and an action content for indicating an execution action; traverse action items in a configuration library, wherein controlled objects of the action items in the configuration library are configured according to applications built-in the display apparatus; in response to determining that no action item in the configuration library matches the at least one keyword, obtain text information of the user interface on the display, and obtain layout information of the user interface; extract a function control in a layout of the user interface according to the text information, wherein the function control is a control having a first text presented on the display and matched with the at least one keyword; and generate a control instruction according to the function control and the voice information; in response to determining that a first action item in the configuration library matches the at least one keyword, cause the display apparatus to execute the first action item; wherein the controller is further configured to: traverse positions of all controls in the layout information of the user interface; calculate a distance between a position of a second text in the text content in an image of the user interface and a position of a second control among the controls in the layout information of the user interface; and in response to determining that the distance is less than or equal to a preset distance threshold, mark the second control corresponding to the distance as the function control.

2. The display apparatus according to claim 1, wherein the controller is further configured to: acquire the voice information from the user via the detector; convert the voice information into a voice text; and extract the at least one keyword from the voice text.

3. The display apparatus according to claim 1, wherein the first action item comprises an action item of which a controlled object is the same as or similar to the name content in the at least one keyword and an action of which an execution action is the same as or similar to the execution action in the at least one keyword.

4. The display apparatus according to claim 2, wherein the controller is further configured to: determine whether the voice text include an action instruction through a preset semantic recognition model; in response to determining that the voice text include the action instruction, proceed to extract the at least one keyword from the voice text; and in response to determining that the voice text include no action instruction, cause the display to present a prompt, wherein the prompt comprises the voice text extracted from the voice information of the user.

5. The display apparatus according to claim 1, wherein the controller is further configured to: take a screenshot of the user interface on the display to generate an image of the user interface; and perform optical character recognition (OCR) on the image of the user interface to obtain the text information of the user interface, wherein the text information comprises a text content and a position of the text content in the image of the user interface.

6. The display apparatus according to claim 1, wherein the controller is further configured to: construct a set of words associated with the text information, wherein the set of associated words comprises a synonym of a name word in the text information; traverse all control names in the layout information of the user interface; compare the control names with the set of associated words; and in response to determining that a control name is the same as a content of any word item in the set of associated words, mark a control corresponding to the control name as the function control.

7. The display apparatus according to claim 1, wherein the controller is further configured to: obtain one or more operation types supported by the function control and an action type specified based on the voice information; compare the one or more operation types with the action type; and in response to determining that at least one of the one or more operation types is the same as the action type, generate the control instruction.

8. The display apparatus according to claim 1, wherein the controller is further configured to: execute the control instruction; and construct an action item in a configuration library based on the control instruction and the controlled object.

9. The display apparatus according to claim 1, wherein the function control comprises a control which is able to configure with a picture or text for visual presentation on a user interface and an application icon.

10. A voice control method for a display apparatus, comprising: displaying a user interface on a display of the display apparatus, wherein the display is configured to display an image from a broadcast system or a network, and/or display the user interface: obtaining voice information input from a user while the user interface is displaying on the display; in response to the voice information, extracting at least one keyword from the voice information, wherein the at least one keyword comprises a name content for indicating a controlled object and an action content for indicating an execution action; traversing action items in a configuration library, wherein controlled objects of the action items in the configuration library are configured according to applications built-in the display apparatus; in response to determining that no action item in the configuration library matches the at least one keyword, obtaining text information of the user interface on the display, and obtaining layout information of the user interface; extracting a function control in a layout of the user interface according to the text information, wherein the function control is a control having a first text presented on the display and matched with the at least one keyword; and generate a control instruction according to the function control and the voice information; in response to determining that a first action item in the configuration library matches the at least one keyword, causing the display apparatus to execute the first action item; wherein the voice control method further comprises: traversing positions of all controls in the layout information of the user interface; calculating a distance between a position of a second text in the text content in an image of the user interface and a position of a second control among the controls in the layout information of the user interface; and in response to determining that the distance is less than or equal to a preset distance threshold, marking the second control corresponding to the distance as the function control.

11. The voice control method according to claim 10, further comprising: acquiring the voice information from the user via the detector; converting the voice information into a voice text; and extracting the at least one keyword from the voice text.

12. The voice control method according to claim 10, wherein the first
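The distance-threshold step recited at the end of claims 1 and 10 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the claims do not specify a distance metric or a coordinate convention, so the sketch assumes Euclidean distance between single reference points, and the names `TextBox`, `Control`, and `mark_function_controls` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TextBox:
    content: str  # text recognized in the user-interface image
    x: float      # assumed reference point of the text, in pixels
    y: float


@dataclass(frozen=True)
class Control:
    name: str  # control name from the layout information
    x: float   # assumed reference point of the control, in pixels
    y: float


def mark_function_controls(text: TextBox, controls: List[Control], threshold: float) -> List[Control]:
    """Return every control whose distance to the text is within the threshold."""
    marked = []
    for control in controls:  # traverse positions of all controls
        # Assumption: Euclidean distance; the claim only says "a distance".
        distance = math.hypot(control.x - text.x, control.y - text.y)
        if distance <= threshold:  # "less than or equal to" per the claim
            marked.append(control)
    return marked
```

With a text at (100, 200), a control at (103, 204) lies exactly 5 pixels away, so it is marked when the threshold is 5.0 and skipped when the threshold is 4.9.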

Assignees

Hisense Visual Tech Co Ltd

Inventors

Classifications

G10L15/08
Speech classification or search · CPC title
G06V30/10
Character recognition · CPC title
G06V2201/02
Recognising information on displays, dials, clocks · CPC title
G06V20/635
Overlay text, e.g. embedded captions in a TV programme · CPC title
G06F3/167
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

Patent family

Related publications grouped by family.

View patent family 81600828

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12437758B2 cover?: Some embodiments of the present application disclose a display apparatus and a voice control method for the display apparatus. The display apparatus comprises a display, a detector and a controller. The display is configured to present a user interface, and the detector is configured to acquire user voice information; and the controller is configured to cause the display apparatus to perform: a…
Who is the assignee on this patent?: Hisense Visual Tech Co Ltd