Reception of audio commands

US10210863B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10210863-B2
Application number  US-201615341552-A
Country             US
Kind code           B2
Filing date         Nov 2, 2016
Priority date       Nov 2, 2016
Publication date    Feb 19, 2019
Grant date          Feb 19, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for controlling a media device and a display device using audio commands. In so doing, some embodiments operate to suppress noise from the display device, and enhance audio commands from users. Some embodiments operate by determining a position of the display device and de-enhancing audio from the display device based on the display device position. The position of the user is determined, and audio from the user based on the user position is enhanced. Then, a command in the enhanced user audio is identified, and the media device and/or the display device are caused to operate according to the command.
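
The position-based idea in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: it assumes each microphone in the array has a known bearing in degrees, that the display and user bearings are already known, and that "de-enhancing" means zero gain while "enhancing" means doubled gain. The names `angular_distance`, `microphone_gains`, `mix`, and `window_deg` are hypothetical.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute angle between two bearings, in degrees (0..180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def microphone_gains(mic_bearings, display_bearing, user_bearing, window_deg=45.0):
    """Assign a gain to each microphone from the display and user positions.

    Assumption: microphones facing the user are boosted, microphones facing
    the display are switched off, and all others are left unchanged.
    """
    gains = []
    for bearing in mic_bearings:
        if angular_distance(bearing, user_bearing) <= window_deg:
            gains.append(2.0)  # enhance audio arriving from the user
        elif angular_distance(bearing, display_bearing) <= window_deg:
            gains.append(0.0)  # de-enhance audio arriving from the display
        else:
            gains.append(1.0)
    return gains


def mix(frames, gains):
    """Combine per-microphone sample lists into one gain-weighted signal."""
    total = sum(gains) or 1.0  # avoid dividing by zero if every gain is 0
    length = len(frames[0])
    return [
        sum(g * frame[i] for g, frame in zip(gains, frames)) / total
        for i in range(length)
    ]


# Four microphones at 0, 90, 180 and 270 degrees; display at 0, user at 180.
gains = microphone_gains([0, 90, 180, 270], display_bearing=0, user_bearing=180)
mixed = mix([[1, 1], [2, 2], [3, 3], [4, 4]], gains)
```

A real device would more likely steer a beam pattern than apply fixed per-microphone gains; the fixed gains here only make the enhance/de-enhance split visible.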

First claim

Opening claim text (preview).

What is claimed is:

1. A method of controlling a media device and a display device using audio commands, comprising: detecting, at an audio responsive control device, a trigger word in audio from a source of audio commands; determining, at the audio responsive control device, that a user is the source of audio commands based on the trigger word being associated with the audio responsive control device; determining, at the audio responsive control device, a position of the display device; de-enhancing, at the audio responsive control device, audio output by a speaker of the display device based on the position of the display device and the determination that the user is the source of audio commands; determining, at the audio responsive control device, a position of the source of audio commands; enhancing, at the audio responsive control device, the audio from the source of audio commands based on the determined position of the source of audio commands and the determination that the user is the source of audio commands; identifying, at the audio responsive control device, a command in the enhanced audio from the source of audio commands; and causing, at the audio responsive control device, at least one of the media device and the display device to operate according to the command.

2. The method of claim 1, further comprising: receiving, at the audio responsive control device, input identifying the position of the display device relative to a microphone array in the audio responsive control device, and the position of the source of audio commands relative to the microphone array.

3. The method of claim 1, the determining the position of the source of audio commands further comprising: identifying, at the audio responsive control device, a microphone in a microphone array in the audio responsive control device having a greatest signal strength when receiving the audio from the source of audio commands; and determining, at the audio responsive control device, the position of the source of audio commands based on the identified microphone.

4. The method of claim 3, wherein the enhancing the audio from the source of audio commands comprises: beam forming, at the audio responsive control device, an audio reception pattern of the identified microphone to improve receipt of the audio from the source of audio commands.

5. The method of claim 1, wherein the enhancing the audio from the source of audio commands comprises: beam forming, at the audio responsive control device, an audio reception pattern of a microphone in a microphone array proximate to the position of the source of audio commands to improve receipt of the audio from the source of audio commands.

6. The method of claim 1, wherein the de-enhancing the audio from the display device based on the position of the display device comprises at least one of: deactivating, at the audio responsive control device, one or more microphones in a microphone array in the audio responsive control device that are proximate to the position of the display device; and muting the audio from the speaker of the display device after receipt of the trigger word.

7. The method of claim 1, wherein the de-enhancing the audio from the display device based on the position of the display device comprises: beam forming, at the audio responsive control device, an audio reception pattern of a microphone in a microphone array that is proximate to the position of the display device to suppress receipt of the audio from the speaker of the display device.

8. The method of claim 1, wherein the de-enhancing the audio output by the speaker of the display device based on the position of the display device comprises: de-enhancing, at the audio responsive control device, the audio output by the speaker of the display device based on at least the trigger word in the audio from the source of audio commands.

9. The method of claim 1, wherein the enhancing the audio from the source of audio commands based on the position of the source of audio commands and the trigger word in the audio from the source of audio commands comprises: enhancing, at the audio responsive control device, the audio from the source of audio commands based on at least a trigger word type of the trigger word.

10. The method of claim 1, wherein the de-enhancing comprises: receiving, via a network, an audio stream that is to be output by the speaker of the display device; determining the audio output by the speaker of the display device matches the audio stream that is to be output by the speaker of the display device; and subtracting the audio stream that is to be output by the speaker of the display device from audio received by a microphone in a microphone array in the audio responsive control device.

11. An audio responsive control device to control a display device and a media server, comprising: a memory; and at least one processor operatively coupled to the memory, the at least one processor configured to: detect a trigger word in audio from a source of audio commands; determine that a user is the source of audio commands based on the trigger word being associated with the audio responsive control device; determine a position of the display device; de-enhance audio output by a speaker of the display device based on a position of the display device and the determination that the user is the source of audio commands; determine a position of a source of audio commands; enhance audio from the source of audio commands based on the determined position of the source of audio commands and the determination that the user is the source of audio commands; identify a command in the enhanced audio from the source of audio commands; and cause at least one of the media device and the display device to operate according to the command.

12. The audio responsive control device of claim 11, the at least one processor further configured to: receive input identifying the position of the display device relative to a microphone array, and the position of the source of audio commands relative to the microphone array.

13. The audio responsive control device of claim 11, the at least one processor further configured to: identify the trigger word in the audio from the source of audio commands; and identify a microphone in a microphone array having a greatest signal strength when receiving the audio from the source of audio commands; and determine the position of the source of audio commands based on the identified microphone.

14. The audio responsive control device of claim 13, wherein to enhance the audio from the source of audio commands the at least one processor is configured to: beam form an audio reception pattern of the identified microphone to improve receipt of the audio from the source of audio commands.

15. The audio responsive control device of claim 11, wherein to enhance the audio from the source of audio commands the at least one processor is configured to: beam form an audio reception pattern of a microphone in a microphone array proximate to the position of the source of audio commands to improve receipt of the audio from the source of audio commands.

16. The audio responsive control device of claim 11, wherein to de-enhance the audio from the display device based on the display device position the at least one processor is configured to: deactivate one or more microphones in the microphone array that are proximate to the position of the display device.

17. The audio responsive control device of claim 11, wherein to de-enhance the
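
Two steps in the claims above lend themselves to a short illustration: picking the microphone with the greatest signal strength (claim 3) and subtracting a known audio stream from the captured audio once the two are found to match (claim 10). The following is a rough sketch under stated assumptions, not the claimed implementation: signal strength is taken to be RMS level, "matches" is approximated by a normalized correlation above a hypothetical `match_threshold`, and the reference stream is assumed to be already time-aligned with the capture. All function names are hypothetical.

```python
def rms(samples):
    """Root-mean-square level, used here as a stand-in for signal strength."""
    return (sum(s * s for s in samples) / len(samples)) ** 0.5


def strongest_microphone(frames):
    """Index of the microphone whose frame has the greatest RMS level."""
    return max(range(len(frames)), key=lambda i: rms(frames[i]))


def correlation(a, b):
    """Normalized correlation of two equal-length signals (-1.0 to 1.0)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def subtract_reference(captured, reference, match_threshold=0.5):
    """Remove a known reference stream from captured audio when they match.

    Assumption: the reference is time-aligned with the capture, and a single
    least-squares scale factor models the speaker-to-microphone path.
    """
    if correlation(captured, reference) < match_threshold:
        return list(captured)  # no match: leave the capture untouched
    energy = sum(r * r for r in reference)
    scale = sum(c * r for c, r in zip(captured, reference)) / energy
    return [c - scale * r for c, r in zip(captured, reference)]


# Microphone 1 carries the loudest frame, so it would indicate the source.
loudest = strongest_microphone([[0.1, -0.1], [0.9, -0.8], [0.2, 0.2]])

# Captured audio = display audio at twice the reference level + user speech.
reference = [1.0, -1.0, 1.0, -1.0]
speech = [0.5, 0.5, -0.5, -0.5]
captured = [2.0 * r + s for r, s in zip(reference, speech)]
residual = subtract_reference(captured, reference)
```

In this toy case the speech is orthogonal to the reference, so the residual recovers it exactly. Real room acoustics add delay and reverberation, which is why practical systems tend to use adaptive echo cancellation rather than a single scale factor.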

Assignees

Inventors

Classifications

  • G10L15/22 (Primary)

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • for combining the signals of two or more microphones (specially adapted for hearing aids H04R25/407) · CPC title

  • Word spotting · CPC title

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

  • microphones · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10210863B2 cover?
Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for controlling a media device and a display device using audio commands. In so doing, some embodiments operate to suppress noise from the display device, and enhance audio commands from users. Some embodiments operate by determini…
Who is the assignee on this patent?
Roku Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 19, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).