Leveraging environmental context for enhanced communication throughput

US10223067B2 · US · B2

Patent metadata
Field · Value
Publication number · US-10223067-B2
Application number · US-201615211794-A
Country · US
Kind code · B2
Filing date · Jul 15, 2016
Priority date · Jul 15, 2016
Publication date · Mar 5, 2019
Grant date · Mar 5, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An environmental context of a user may be taken into account to enhance the user's communication throughput. An “environmental context” can include spatial surroundings of a user, device, and/or sensor of the device and more broadly to denote the context of the user in a multiplicity of environments such as, for example, the surroundings of a user, a digital environment such as the user or other individuals' interactions with or made near a device, etc. The techniques can include obtaining contextual data to provide context-predicted suggestions of words and/or phrases that a user can select to be output on the user's behalf. In some examples, the techniques can also use contextual data to weight, sort, rank, and/or filter word and/or phrase suggestions.
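The weighting, sorting, ranking, and filtering mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `Suggestion` type, the `base_score` field (standing in for a score from some language model), the multiplicative `boost` per matching context term, and the `min_score` cutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    text: str
    base_score: float  # hypothetical context-free score, e.g. from a language model


def rank_suggestions(suggestions, context_terms, boost=2.0, min_score=0.1, top_k=3):
    """Weight, filter, sort, and truncate suggestions using contextual terms.

    context_terms stands in for contextual data such as labels of nearby
    objects or key words heard in conversation (an assumption of this sketch).
    """
    context = {term.lower() for term in context_terms}
    scored = []
    for suggestion in suggestions:
        words = set(suggestion.text.lower().split())
        overlap = len(words & context)
        # Weight: multiply the base score once per matching context term.
        score = suggestion.base_score * (boost ** overlap)
        # Filter: drop suggestions that remain below the cutoff.
        if score >= min_score:
            scored.append((score, suggestion.text))
    # Sort/rank: highest score first, ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [text for _, text in scored[:top_k]]


candidates = [
    Suggestion("pass the salt", 0.2),
    Suggestion("how are you", 0.3),
    Suggestion("see you later", 0.05),
]
ranked = rank_suggestions(candidates, context_terms=["salt", "table"])
print(ranked)
```

In this example the phrase that mentions an object from the surroundings moves ahead of a phrase with a higher base score, and the lowest-scoring phrase is filtered out. A real system would likely use a learned model rather than word overlap.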

First claim

Opening claim text (preview).

What is claimed is:

1. A system for improved communications throughput for at least one user by reducing a communication throughput time or increasing a communication throughput rate at which communication is produced, the system comprising: one or more cameras; one or more microphones; one or more processors; a human-machine interface; and computer-readable media having stored thereon computer-executable instructions, that, when executed by the one or more processors, configure the system to perform operations comprising: obtaining an image by the one or more cameras; obtaining a salient object label for an object in the image; capturing audio data via the one or more microphones; determining a conversational context based on the audio data, the conversational context including one or more of syntax information, semantic information, and key words; generating a predicted word based on the salient object label; generating predicted phrases based on the predicted word, the salient object label, the object in the image, and the conversational context; selecting one or more suggested utterances from among the predicted word and the predicted phrases; and outputting a representation of the one or more suggested utterances via the human-machine interface, thereby providing improved communications throughput for the at least one user by reducing the communication throughput time or increasing the communication throughput rate at which communication is produced.

2. The system as claim 1 recites, the outputting including: rendering the image via the human-machine interface; and rendering the representation of the one or more suggested utterances as one or more labels, the one or more labels being displayed as one or more superimposed labels over a location in the image corresponding to the object.

3. The system as claim 1 recites, the outputting including: one or more of: rendering the one or more suggested utterances via the human-machine interface as a symbol, word, or phrase completion, rendering the one or more suggested utterances via the human-machine interface as a part of a rendered keyboard, rendering at a devoted portion of the human-machine interface, or rendering a pictorial representation of the one or more suggested utterances via the human-machine interface.

4. The system as claim 1 recites, the human-machine interface including a display and the outputting the one or more suggested utterances to the human-machine interface including: rendering, in a designated area of the display devoted to environmentally contextual predicted words or phrases, a portion of the one or more suggested utterances corresponding to predicted word or phrases that relate to surroundings of a user; ranking remaining portions of the one or more suggested utterances based on relatedness to the surroundings of the user; and rendering, at the display outside the designated area, the remaining portions in an order corresponding to the ranking, the remaining portions being displayed as one or more superimposed labels over a location in the image corresponding to the surroundings of the user.

5. The system as claim 1 recites, the operations further comprising: receiving an input indicative of a selection of a word or a phrase of the one or more suggested utterances; provisioning for output the selected word or phrase to one or more of: a network interface, an application stored on the computer readable media, or the human-machine interface.

6. The system as claim 5 recites, wherein the receiving an input indicative of a selection includes: obtaining gaze data by the one or more cameras; and correlating the gaze data with a discrete portion of the human-machine interface, the discrete portion corresponding to one of the predicted word or predicted phrases.

7. The system as claim 1 recites, wherein at least one of the generating the predicted word, the selecting the one or more suggested utterances, and the outputting the one or more suggested utterances via the human-machine interface, is further based on the conversational context.

8. The system as claim 1 recites, the operations further comprising: capturing gaze data of a user of the system; identifying at least one of a portion of the human-machine interface or the salient object based on the gaze data; and wherein at least one of: the generating the predicted word, the generating the predicted phrases, the selecting the one or more suggested utterances, or the outputting the one or more suggested utterances via the human-machine interface, is further based on the identifying.

9. The system as claim 8 recites, the operations further comprising: identifying, based on the gaze data, one or more of: a portion of a user interface rendered via the human-machine interface, a representation of the object, a portion of the image, a portion of an environment of the user, or a subset of options for input rendered via the human-machine interface.

10. The system as claim 1 recites, the operations further comprising: determining a condition or a position of the object, the object being a first object; determining a second salient object label for a second object in the image; identifying a relative position of the first object relative to the second object based on the condition or the position of the object and based on a condition or a position of the second object; and wherein at least one of: the generating the predicted word, the generating the predicted phrases, the selecting the one or more suggested utterances, or the outputting the one or more suggested utterances via the human-machine interface, is further based on one or more of one or more of: the second salient object label, a co-occurrence of the first object and the second object in the image, or the condition of the first object, the position of the first object, or the relative position of the first object to the second object.

11. The system as claim 1 recites, the operations further comprising: obtaining application data, the application data including one or more of: an identification of an application-in-use or a recently-used application and identifiers of functionality of the application-in-use or the recently-used application, an identifier of a portion of the application-in-use or the recently-used application designated to receive the one or more suggested utterances, or a usage history or data stored by an application stored on the computer-readable media; and wherein at least one of: the generating the predicted word, the generating the predicted phrases, the selecting the one or more suggested utterances, or the outputting the one or more suggested utterances via the human-machine interface, is further based on the application data.

12. The system as claim 1 recites, wherein the image is an image of physical surroundings of the user.

13. A method for improved communications throughput for at least one user by reducing a communication throughput time or increasing a communication throughput rate at which communication is produced, the method comprising: obtaining an image by one or more cameras; obtaining a salient object label for an object in the image; capturing audio data via one or more microphones; determining a conversational context based on the audio data, the conversational context including one or more of syntax information, semantic information, and key words; generating a predicted word based on the salient object label; generating predicted phrases based on the predicted word, the salient object label, the obje
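The sequence of operations recited in claim 1 can be illustrated with a rough sketch. This is not the patented implementation: it assumes the salient object label and a speech transcript are already available (object detection and speech recognition are out of scope), reduces the conversational context to key words only, and generates phrases from hypothetical templates. All names, including `PHRASE_TEMPLATES` and `STOP_WORDS`, are invented for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical stop-word list used to pick out key words from a transcript.
STOP_WORDS = {"a", "an", "the", "do", "you", "to", "what", "is", "it"}

# Hypothetical phrase templates keyed by a conversational key word.
PHRASE_TEMPLATES = {
    "want": "I would like the {word}",
    "where": "The {word} is over there",
}
DEFAULT_TEMPLATE = "Please pass the {word}"


@dataclass
class ConversationalContext:
    key_words: set = field(default_factory=set)


def determine_conversational_context(transcript):
    """Reduce a transcript to key words (syntax and semantics are omitted here)."""
    words = [w.strip(".,?!").lower() for w in transcript.split()]
    return ConversationalContext({w for w in words if w and w not in STOP_WORDS})


def generate_predicted_word(salient_object_label):
    """Simplest possible prediction: reuse the object label as the word."""
    return salient_object_label.lower()


def generate_predicted_phrases(word, context):
    """Fill the templates whose key word was heard in the conversation."""
    phrases = [
        template.format(word=word)
        for key, template in PHRASE_TEMPLATES.items()
        if key in context.key_words
    ]
    return phrases or [DEFAULT_TEMPLATE.format(word=word)]


def select_suggested_utterances(word, phrases, limit=3):
    """Select from among the predicted word and the predicted phrases."""
    return ([word] + phrases)[:limit]


def suggest(salient_object_label, transcript, limit=3):
    context = determine_conversational_context(transcript)
    word = generate_predicted_word(salient_object_label)
    phrases = generate_predicted_phrases(word, context)
    return select_suggested_utterances(word, phrases, limit)


suggestions = suggest("Cup", "What do you want to drink?")
print(suggestions)
```

The returned list would then be rendered through the human-machine interface, for example as labels superimposed on the image as in claim 2. Keeping each claimed operation in its own function makes it straightforward to swap in a real detector, recognizer, or language model.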

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

  • Details of searching files based on file metadata · CPC title

  • G06F3/167 · Primary

    Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually · CPC title

  • Query formulation · CPC title

  • Speech to text systems (G10L15/08 takes precedence) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10223067B2 cover?
An environmental context of a user may be taken into account to enhance the user's communication throughput. An “environmental context” can include spatial surroundings of a user, device, and/or sensor of the device and more broadly to denote the context of the user in a multiplicity of environments such as, for example, the surroundings of a user, a digital environment such as the user or othe…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06F3/167. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 5, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).