What technology area does this patent fall under?

Primary CPC classification G10L15/1815. Mapped technology areas include Physics.

When was this patent published?

Publication date: Apr 1, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Speech interpretation based on environmental context

US12266354B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12266354-B2
Application number	US-202117500518-A
Country	US
Kind code	B2
Filing date	Oct 13, 2021
Priority date	Jul 15, 2021
Publication date	Apr 1, 2025
Grant date	Apr 1, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and processes for speech interpretation based on environmental context are provided. For example, a user gaze direction is detected, and a speech input is received from a first user of the electronic device. In accordance with a determination that the user gaze is directed at a digital assistant object, the speech input is processed by the digital assistant. In accordance with a determination that the user gaze is not directed at a digital assistant object, contextual information associated with the electronic device is obtained, wherein the contextual information includes speech from a second user. Determination is made whether the speech input is directed to a digital assistant of the electronic device. In accordance with a determination that the speech input is directed to a digital assistant of the electronic device, the speech input is processed by the digital assistant.
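The decision flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Observation` type, the `score_directedness` callback, the `toy_scorer` stand-in, and the 0.5 threshold are all hypothetical names and values chosen for this example. The abstract does not say how the directedness determination is computed.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Observation:
    """Hypothetical snapshot of what the device sensed (names are illustrative)."""
    gaze_on_assistant_object: bool   # is the first user's gaze on the assistant object?
    first_user_speech: str           # speech input from the first user
    second_user_speech: List[str] = field(default_factory=list)  # contextual speech


def should_process(
    obs: Observation,
    score_directedness: Callable[[str, List[str]], float],
    threshold: float = 0.5,  # assumed value; the abstract gives no number
) -> bool:
    """Return True if the assistant should process the first user's speech."""
    # Branch 1: gaze is directed at the assistant object, so process directly.
    if obs.gaze_on_assistant_object:
        return True
    # Branch 2: gaze is elsewhere, so consult contextual information,
    # including speech from a second user, before deciding.
    score = score_directedness(obs.first_user_speech, obs.second_user_speech)
    return score >= threshold


def toy_scorer(speech: str, context: List[str]) -> float:
    """Toy stand-in scorer (an assumption): less confident when someone else just spoke."""
    return 0.2 if context else 0.8
```

With the toy scorer, an utterance made while looking at the assistant object is always processed, while an utterance made while looking elsewhere is processed only if no second user's speech is present in the context.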

First claim

Opening claim text (preview).

What is claimed is:

1. An electronic device, comprising: one or more processors; and memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions, which when executed, cause the electronic device to: detect a user gaze direction, wherein the user gaze direction is associated with a first user of the electronic device; receive, from a first user of the electronic device, a first speech input including first content; in accordance with a determination that the user gaze direction associated with the first user is not directed at a displayed digital assistant object: obtain contextual information associated with the electronic device, wherein the contextual information includes a second speech input from a second user, wherein the second speech input includes second content; adjust a confidence value based on the first content and the second content; determine, based on the contextual information and the confidence value, whether the first speech input is directed to the digital assistant of the electronic device; and in accordance with a determination that the first speech input is directed to the digital assistant of the electronic device: process, by the digital assistant, the first speech input.

2. The electronic device of claim 1, wherein the instructions cause the electronic device to: detect a beginning of the second speech input from the second user; and in response to detecting the beginning of the second speech input from the second user, store, in the memory, the second speech input from the second user.

3. The electronic device of claim 2, wherein the instructions cause the electronic device to: in accordance with a determination that the user gaze direction is directed at a displayed digital assistant object, remove, from the memory, the second speech input from the second user.

4. The electronic device of claim 2, wherein the instructions cause the electronic device to: in accordance with a determination that the first speech input is directed to a digital assistant of the electronic device, remove, from the memory, the second speech input from the second user.

5. The electronic device of claim 2, wherein the instructions cause the electronic device to: identify a first time associated with the storing of the second speech input from the second user; and in accordance with a determination that a current time is not within a threshold time duration from the first time, remove, from the memory, the second speech input from the second user.

6. The electronic device of claim 1, wherein the instructions cause the electronic device to: detect, at a first time, motion corresponding to the second user; identify a second time associated with a beginning of the second speech input from the second user; and in accordance with a determination that the first time is not within a threshold duration of time from the second time, adjust a confidence value associated with the first speech input.

7. The electronic device of claim 6, wherein the detected motion corresponds to one of movement of the second user and movement of an avatar associated with second user.

8. The electronic device of claim 1, wherein determining, based on the contextual information, whether the first speech input is directed to a digital assistant of the electronic device comprises: obtaining a confidence value corresponding to a confidence that the first speech input is directed to the digital assistant of the electronic device; and in accordance with a determination that the confidence value exceeds a threshold confidence value, determining that the first speech input is directed to the digital assistant.

9. The electronic device of claim 1, wherein the instructions cause the electronic device to: determine a direction associated with the second speech input from the second user; and in accordance with a determination that the direction associated with the second speech input from the second user corresponds to the user gaze direction associated with the first user, adjust a confidence value associated with the first speech input.

10. The electronic device of claim 1, wherein the instructions cause the electronic device to: identify a time associated with the second speech input from the second user; determine a direction associated with the second speech input from the second user; and obtain second contextual information within a time range from the identified time, wherein the second contextual information includes user gaze information within the time range.

11. The electronic device of claim 10, wherein the instructions cause the electronic device to: in accordance with a determination that the second contextual information includes a user gaze direction corresponding to the direction associated with the second speech input from the second user: adjust a confidence value associated with the first speech input.

12. The electronic device of claim 1, wherein the instructions cause the electronic device to: identify a first time associated with the first speech input; identify a second time associated with the second speech input from the second user; and in accordance with a determination that the first time and the second time are within a predetermined time range, adjust a confidence value associated with the first speech input.

13. The electronic device of claim 1, wherein the instructions cause the electronic device to: determine a first word included within the first speech input; determine a second word included within the second speech input from the second user; and in accordance with a determination that the first word corresponds to the second word, adjust a confidence value associated with the first speech input.

14. The electronic device of claim 1, wherein the instructions cause the electronic device to: obtain a first semantic representation of the first speech input; obtain a second semantic representation of the second speech input from the second user; and in accordance with a determination that the first semantic representation corresponds to the second semantic representation, adjust a confidence value associated with the first speech input.

15. The electronic device of claim 1, wherein the instructions cause the electronic device to: determine content associated with the second speech input from the second user; and in accordance with a determination that the determined content corresponds to predefined content, adjust a confidence value associated with the first speech input.

16. The electronic device of claim 15, wherein the predefined content includes at least one of an interrogatory sentence, a name associated with the first user, and a reference to a parameter associated with a profile corresponding to the first user.

17. The electronic device of claim 1, wherein at least one of the first content and the second content includes a word.

18. A non-transitory computer-readable storage medium storing one or more programs configured to be executed by one or more processors of an electronic device, the one or more programs including instructions, which when executed, cause the electronic device to: detect a user gaze direction, wherein the user gaze direction is associated with a first user of the electronic device; receive, from a first user of the electronic device, a first speech input including first content; in accordance with a determination that the user gaze direction associated with the first us
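Several dependent claims describe cues that adjust a confidence value (claims 12 and 13) and a threshold test on that value (claim 8). The sketch below shows one way these pieces might fit together. It is illustrative only: the claims say "adjust" without giving a direction or magnitude, so the downward penalty, the starting confidence, the 5-second window, and the 0.5 threshold are all assumptions, and `Utterance`, `shares_word`, and `directed_at_assistant` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical record of a speech input (fields are illustrative)."""
    text: str
    start_time: float  # seconds


def shares_word(a: str, b: str) -> bool:
    """Crude word-correspondence check in the spirit of claim 13: any common token."""
    return bool(set(a.lower().split()) & set(b.lower().split()))


def directed_at_assistant(
    first: Utterance,
    second: Utterance,
    base_confidence: float = 0.6,  # assumed starting value
    time_window: float = 5.0,      # assumed "predetermined time range" (claim 12)
    penalty: float = 0.25,         # assumed adjustment size and direction
    threshold: float = 0.5,        # assumed threshold confidence value (claim 8)
) -> bool:
    confidence = base_confidence
    # Claim 12: the two speech inputs fall within a predetermined time range.
    if abs(first.start_time - second.start_time) <= time_window:
        confidence -= penalty
    # Claim 13: a word in the first input corresponds to a word in the second.
    if shares_word(first.text, second.text):
        confidence -= penalty
    # Claim 8: directed at the assistant only if confidence exceeds the threshold.
    return confidence > threshold
```

Under these assumed values, a reply such as "lunch at noon works" spoken two seconds after a second user asks "when is lunch" is treated as conversation between the users, while an unrelated command spoken much later is treated as directed at the assistant.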

Assignees

Apple Inc

Inventors

Classifications

G10L15/25
using position of the lips, movement of the lips or face analysis · CPC title
G06F3/013
Eye tracking input arrangements (G06F3/015 takes precedence) · CPC title
G10L15/1815 · Primary
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
G06F3/167 · Primary
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
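The classification symbols above follow the standard CPC layout: a section letter, a two-digit class, a subclass letter, then a main group and subgroup separated by a slash. The sketch below splits a symbol into those parts and looks up the section title. Section G is titled "Physics" in the CPC scheme, which is consistent with the technology area this page reports; how this site actually derives its "technology areas" is not stated, so treating the section title as the area is an assumption. `parse_cpc` and `CpcSymbol` are hypothetical names.

```python
import re
from typing import NamedTuple

# CPC section titles (top level of the scheme).
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General Tagging of New Technological Developments",
}


class CpcSymbol(NamedTuple):
    section: str
    cpc_class: str
    subclass: str
    main_group: str
    subgroup: str


# Section letter, 2-digit class, subclass letter, main group "/" subgroup.
_CPC_RE = re.compile(r"^([A-HY])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol such as 'G10L15/1815' into its components."""
    match = _CPC_RE.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a CPC symbol: {symbol!r}")
    return CpcSymbol(*match.groups())
```

For example, `parse_cpc("G10L15/1815")` yields section `G`, class `10`, subclass `L`, main group `15`, and subgroup `1815`.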

Patent family

Related publications grouped by family.

View patent family 82748471

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12266354B2 cover?: Systems and processes for speech interpretation based on environmental context are provided. For example, a user gaze direction is detected, and a speech input is received from a first user of the electronic device. In accordance with a determination that the user gaze is directed at a digital assistant object, the speech input is processed by the digital assistant. In accordance with a determi…
Who is the assignee on this patent?: Apple Inc