Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06F3/167. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 5, 2019 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Leveraging environmental context for enhanced communication throughput

US10223067B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10223067-B2
Application number	US-201615211794-A
Country	US
Kind code	B2
Filing date	Jul 15, 2016
Priority date	Jul 15, 2016
Publication date	Mar 5, 2019
Grant date	Mar 5, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An environmental context of a user may be taken into account to enhance the user's communication throughput. An “environmental context” can include spatial surroundings of a user, device, and/or sensor of the device and more broadly to denote the context of the user in a multiplicity of environments such as, for example, the surroundings of a user, a digital environment such as the user or other individuals' interactions with or made near a device, etc. The techniques can include obtaining contextual data to provide context-predicted suggestions of words and/or phrases that a user can select to be output on the user's behalf. In some examples, the techniques can also use contextual data to weight, sort, rank, and/or filter word and/or phrase suggestions.
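The weighting, sorting, and filtering step mentioned in the abstract can be illustrated with a minimal sketch. It is not the patented method: the `Suggestion` type, the `rank_suggestions` function, the multiplicative `boost`, and the sample scores are all hypothetical, and the contextual terms stand in for salient object labels or conversational key words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    text: str
    base_score: float  # hypothetical score assigned before any context is applied


def rank_suggestions(suggestions, context_terms, boost=2.0, min_score=0.0, top_k=3):
    """Weight, filter, and sort suggestions using contextual terms (illustrative only)."""
    context = {term.lower() for term in context_terms}
    scored = []
    for s in suggestions:
        words = set(s.text.lower().split())
        overlap = len(words & context)            # how many context terms the phrase mentions
        score = s.base_score * (boost ** overlap)  # weight: boost once per matching term
        if score > min_score:                      # filter: drop suggestions at or below the floor
            scored.append((score, s.text))
    # sort: highest score first, ties broken alphabetically for determinism
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [text for _, text in scored[:top_k]]


# Assumed inputs: "cup" and "coffee" play the role of salient object labels.
candidates = [
    Suggestion("please pass the cup", 0.2),
    Suggestion("how are you", 0.5),
    Suggestion("I would like some coffee", 0.3),
    Suggestion("goodbye", 0.0),
]
context_terms = ["cup", "coffee"]
print(rank_suggestions(candidates, context_terms))
```

In this sketch, phrases that mention a contextual term move up the list, and a phrase with a zero score is filtered out. A real system would presumably use a learned model rather than word overlap.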

First claim

Opening claim text (preview).

What is claimed is:

1. A system for improved communications throughput for at least one user by reducing a communication throughput time or increasing a communication throughput rate at which communication is produced, the system comprising: one or more cameras; one or more microphones; one or more processors; a human-machine interface; and computer-readable media having stored thereon computer-executable instructions, that, when executed by the one or more processors, configure the system to perform operations comprising: obtaining an image by the one or more cameras; obtaining a salient object label for an object in the image; capturing audio data via the one or more microphones; determining a conversational context based on the audio data, the conversational context including one or more of syntax information, semantic information, and key words; generating a predicted word based on the salient object label; generating predicted phrases based on the predicted word, the salient object label, the object in the image, and the conversational context; selecting one or more suggested utterances from among the predicted word and the predicted phrases; and outputting a representation of the one or more suggested utterances via the human-machine interface, thereby providing improved communications throughput for the at least one user by reducing the communication throughput time or increasing the communication throughput rate at which communication is produced.

2. The system as claim 1 recites, the outputting including: rendering the image via the human-machine interface; and rendering the representation of the one or more suggested utterances as one or more labels, the one or more labels being displayed as one or more superimposed labels over a location in the image corresponding to the object.

3. The system as claim 1 recites, the outputting including: one or more of: rendering the one or more suggested utterances via the human-machine interface as a symbol, word, or phrase completion, rendering the one or more suggested utterances via the human-machine interface as a part of a rendered keyboard, rendering at a devoted portion of the human-machine interface, or rendering a pictorial representation of the one or more suggested utterances via the human-machine interface.

4. The system as claim 1 recites, the human-machine interface including a display and the outputting the one or more suggested utterances to the human-machine interface including: rendering, in a designated area of the display devoted to environmentally contextual predicted words or phrases, a portion of the one or more suggested utterances corresponding to predicted word or phrases that relate to surroundings of a user; ranking remaining portions of the one or more suggested utterances based on relatedness to the surroundings of the user; and rendering, at the display outside the designated area, the remaining portions in an order corresponding to the ranking, the remaining portions being displayed as one or more superimposed labels over a location in the image corresponding to the surroundings of the user.

5. The system as claim 1 recites, the operations further comprising: receiving an input indicative of a selection of a word or a phrase of the one or more suggested utterances; provisioning for output the selected word or phrase to one or more of: a network interface, an application stored on the computer readable media, or the human-machine interface.

6. The system as claim 5 recites, wherein the receiving an input indicative of a selection includes: obtaining gaze data by the one or more cameras; and correlating the gaze data with a discrete portion of the human-machine interface, the discrete portion corresponding to one of the predicted word or predicted phrases.

7. The system as claim 1 recites, wherein at least one of the generating the predicted word, the selecting the one or more suggested utterances, and the outputting the one or more suggested utterances via the human-machine interface, is further based on the conversational context.

8. The system as claim 1 recites, the operations further comprising: capturing gaze data of a user of the system; identifying at least one of a portion of the human-machine interface or the salient object based on the gaze data; and wherein at least one of: the generating the predicted word, the generating the predicted phrases, the selecting the one or more suggested utterances, or the outputting the one or more suggested utterances via the human-machine interface, is further based on the identifying.

9. The system as claim 8 recites, the operations further comprising: identifying, based on the gaze data, one or more of: a portion of a user interface rendered via the human-machine interface, a representation of the object, a portion of the image, a portion of an environment of the user, or a subset of options for input rendered via the human-machine interface.

10. The system as claim 1 recites, the operations further comprising: determining a condition or a position of the object, the object being a first object; determining a second salient object label for a second object in the image; identifying a relative position of the first object relative to the second object based on the condition or the position of the object and based on a condition or a position of the second object; and wherein at least one of: the generating the predicted word, the generating the predicted phrases, the selecting the one or more suggested utterances, or the outputting the one or more suggested utterances via the human-machine interface, is further based on one or more of one or more of: the second salient object label, a co-occurrence of the first object and the second object in the image, or the condition of the first object, the position of the first object, or the relative position of the first object to the second object.

11. The system as claim 1 recites, the operations further comprising: obtaining application data, the application data including one or more of: an identification of an application-in-use or a recently-used application and identifiers of functionality of the application-in-use or the recently-used application, an identifier of a portion of the application-in-use or the recently-used application designated to receive the one or more suggested utterances, or a usage history or data stored by an application stored on the computer-readable media; and wherein at least one of: the generating the predicted word, the generating the predicted phrases, the selecting the one or more suggested utterances, or the outputting the one or more suggested utterances via the human-machine interface, is further based on the application data.

12. The system as claim 1 recites, wherein the image is an image of physical surroundings of the user.

13. A method for improved communications throughput for at least one user by reducing a communication throughput time or increasing a communication throughput rate at which communication is produced, the method comprising: obtaining an image by one or more cameras; obtaining a salient object label for an object in the image; capturing audio data via one or more microphones; determining a conversational context based on the audio data, the conversational context including one or more of syntax information, semantic information, and key words; generating a predicted word based on the salient object label; generating predicted phrases based on the predicted word, the salient object label, the obje

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06F16/14
Details of searching files based on file metadata · CPC title
G06F3/167 · Primary
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
G06F16/58
Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually · CPC title
G06F16/144
Query formulation · CPC title
G10L15/26
Speech to text systems (G10L15/08 takes precedence) · CPC title

Patent family

Related publications grouped by family.

View patent family 59409757

External sources

Related patents

Information superimposed image display device, non-transitory computer-readable medium which records information superimposed image display program, and information superimposed image display method

Vpa with integrated object recognition and facial expression recognition

In-store object highlighting by a real world user interface

Methods and systems for image or audio recognition processing
