What technology area does this patent fall under?

Primary CPC classification G10L15/22. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 28, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Context-based device arbitration

US10546583B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10546583-B2
Application number	US-201715691460-A
Country	US
Kind code	B2
Filing date	Aug 30, 2017
Priority date	Aug 30, 2017
Publication date	Jan 28, 2020
Grant date	Jan 28, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This disclosure describes, in part, context-based device arbitration techniques to select a voice-enabled device from multiple voice-enabled devices to provide a response to a command included in a speech utterance of a user. In some examples, the context-driven arbitration techniques may include determining a ranked list of voice-enabled devices that are ranked based on audio signal metric values for audio signals generated by each voice-enabled device, and iteratively moving through the list to determine, based on device states of the voice-enabled devices, whether one of the voice-enabled devices can perform an action responsive to the command. If the voice-enabled devices that detected the speech utterance are unable to perform the action responsive to the command, all other voice-enabled devices associated with an account may be analyzed to determine whether one of the other voice-enabled devices can perform the action responsive to the command in the speech utterance.
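The arbitration flow summarized in the abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the patented implementation: the `Device` record, its `online`, `capabilities`, and `metric` fields, and the `can_perform` and `arbitrate` functions are hypothetical names, and modelling device state as an online flag plus a set of supported actions is an assumption made only for the example.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class Device:
    """Hypothetical record for a voice-enabled device (names are assumptions)."""
    device_id: str
    online: bool = True
    capabilities: set = field(default_factory=set)
    # Audio signal metric (for example an SNR) reported for this utterance;
    # None means the device did not detect the utterance.
    metric: Optional[float] = None


def can_perform(device: Device, action: str) -> bool:
    """Assumed device-state check: the device is online and supports the action."""
    return device.online and action in device.capabilities


def arbitrate(account_devices: Iterable[Device], action: str) -> Optional[Device]:
    devices = list(account_devices)
    # 1. Rank the devices that detected the utterance, best metric first.
    heard = sorted(
        (d for d in devices if d.metric is not None),
        key=lambda d: d.metric,
        reverse=True,
    )
    # 2. Move through the ranked list and take the first device whose state
    #    allows it to perform the action.
    for device in heard:
        if can_perform(device, action):
            return device
    # 3. Otherwise consider the other devices associated with the account.
    for device in devices:
        if device.metric is None and can_perform(device, action):
            return device
    return None


# Example: the best-ranked device is offline, so the next one is chosen.
kitchen = Device("kitchen", online=False, capabilities={"play_music"}, metric=18.0)
bedroom = Device("bedroom", capabilities={"play_music"}, metric=9.5)
tv = Device("tv", capabilities={"play_video"})

chosen = arbitrate([kitchen, bedroom, tv], "play_music")
print(chosen.device_id)  # prints "bedroom"
```

In this sketch the `tv` device never detected the utterance, so it is only considered in the fallback step, for instance when the requested action is `"play_video"`.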

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: one or more processors; computer-readable media storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: receiving, from a first voice-enabled device, first audio data representing a speech utterance; receiving, from the first voice-enabled device, a first audio signal metric value indicating a first signal-to-noise ratio associated with the first audio data; receiving, from a second voice-enabled device, second audio data representing the speech utterance; receiving, from the second voice-enabled device, a second audio signal metric value indicating a second signal-to-noise ratio associated with the second audio data; determining that the first signal-to-noise ratio is greater than the second signal-to-noise ratio; identifying device state data associated with the first voice-enabled device; generating, using automatic speech recognition (ASR) on at least one of the first audio data or the second audio data, text data corresponding to the speech utterance; determining, using natural language understanding (NLU) on the text data, intent data associated with the speech utterance, the intent data representing a request for a client device to perform an action; determining, based at least in part on the device state data, that the first voice-enabled device is capable of performing the action responsive to the speech utterance; determining a command to cause the first voice-enabled device to perform the action; and sending, to the first voice-enabled device, data indicating the command.

2. The system of claim 1, the operations further comprising causing the second voice-enabled device to stop transmitting the second audio data, the second voice-enabled device being stopped from transmitting the second audio data prior to the first voice-enabled device stopping transmitting the first audio data, wherein generating the text data is performed using ASR on the first audio data.

3. The system of claim 1, the operations further comprising: determining that the first voice-enabled device is included in a stored grouping of devices that includes the first voice-enabled device and a third voice-enabled device; identifying device state data associated with the stored grouping of devices; and determining that the stored grouping of devices is capable of performing the action responsive to the speech utterance.

4. The system of claim 1, wherein identifying the device state data associated with the first voice-enabled device comprises: sending a request to an event component to provide an indication of the device state data associated with the first voice-enabled device; and receiving, from the event component, the device state data.

5. A system comprising: one or more processors; computer-readable media storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: receiving a first device identifier of a first device; receiving first audio data associated with the first device identifier, the first audio data representing a speech utterance; receiving a second device identifier of a second device; receiving second audio data associated with the second device identifier, the second audio data representing the speech utterance; determining intent data associated with the speech utterance, the intent data representing a machine response for a device to perform responsive to the speech utterance; identifying first device state data associated with the first device; identifying second device state data associated with the second device; based at least in part on the second device state data, determining the second device is to be used for the machine response; determining command data to cause the second device to perform the machine response; and sending, to the second device, the command data to perform the machine response.

6. The system of claim 5, further comprising determining, based on the first device state data, that the first device is offline.

7. The system of claim 5, the operations further comprising: determining that the first device is included in a stored grouping of devices that includes the first device and a third device; identifying device state data associated with the stored grouping of devices; and determining, based on the device state data associated with the stored grouping of devices, that the stored grouping of devices is offline.

8. The system of claim 5, the operations further comprising: determining that the first device is associated with a secondary device; identifying third device state data associated with the secondary device; and determining, based on the third device state data, that the secondary device is offline.

9. The system of claim 5, the operations further comprising: determining, based on the first device state data, that the first device is offline; and storing an indication that the second device is to perform the machine response.

10. The system of claim 5, the operations further comprising receiving an indication that the first device is ranked higher than the second device based at least in part on a first audio signal metric associated with the first audio data and a second audio signal metric associated with the second audio data.

11. The system of claim 10, wherein: the first audio signal metric associated with the first audio data comprises at least one of: a first signal-to-noise value of the first audio data; a first amplitude of the first audio data; or a first level of voice activity in the first audio data; and the second audio signal metric associated with the second audio data comprises at least one of: a second signal-to-noise value of the second audio data; a second amplitude of the second audio data; or a second level of voice activity in the second audio data.

12. The system of claim 5, the operations further comprising receiving an indication that the first device is ranked higher than the second device, wherein the first device and the second device are ranked based on one or more of: input received via an input control of the first device; a distance of a user to the first device; or image data indicating that the user is at least partially facing the first device.

13. A method comprising: receiving first audio data associated with a first device, the first audio data representing a speech utterance; receiving second audio data associated with a second device, the second audio data representing the speech utterance; identifying first device state data associated with the first device; identifying second device state data associated with the second device; determining intent data associated with the speech utterance, the intent data representing a machine response for a device to perform responsive to the speech utterance; based at least in part on the second device state data, determining the second device is to be used for the machine response; determining command data to cause the second device to perform the machine response; and sending, to the second device, the command data to perform the machine response.

14. The method of claim 13, further comprising determining, based on the first device state data, that the first device is offline.

15. The method of claim 13, further comprising: determining that the first device is included in a stored grouping of devices that includes the first device and a third device; id…
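The claims refer to audio signal metrics such as a signal-to-noise ratio without saying how the value is computed. As one conventional possibility (an assumption for illustration, not something the claims specify), the ratio can be expressed in decibels as ten times the base-10 logarithm of mean speech power over mean noise power. The helper names `mean_power` and `snr_db` and the sample values below are hypothetical.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Average of the squared sample values."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """Signal-to-noise ratio in decibels: 10 * log10(P_speech / P_noise)."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


# Made-up samples: the speech amplitude is ten times the noise amplitude,
# so the power ratio is about 100 and the SNR is about 20 dB.
speech = [0.5, -0.5, 0.5, -0.5]
noise = [0.05, -0.05, 0.05, -0.05]
first_snr = snr_db(speech, noise)

# A second device farther from the speaker picks up a weaker signal.
second_snr = snr_db([0.1, -0.1, 0.1, -0.1], noise)

# Comparison of the kind recited in claim 1: the first ratio is the greater one.
print(first_snr > second_snr)  # prints "True"
```

A real device would need to separate speech from noise before computing such a value, for example with voice activity detection; that step is outside the scope of this sketch.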

Assignees

Amazon Tech Inc

Inventors

Classifications

G10L15/22 · Primary
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G10L2015/223
Execution procedure of a spoken command · CPC title
G10L25/84
for discriminating voice from noise · CPC title
G10L15/1815
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
G10L15/28
Constructional details of speech recognition systems · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 63638329

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10546583B2 cover?: This disclosure describes, in part, context-based device arbitration techniques to select a voice-enabled device from multiple voice-enabled devices to provide a response to a command included in a speech utterance of a user. In some examples, the context-driven arbitration techniques may include determining a ranked list of voice-enabled devices that are ranked based on audio signal metric val…
Who is the assignee on this patent?: Amazon Tech Inc

Related patents

Neural network based beam selection

Neural network based beam selection

Device selection for providing a response

Arbitration between voice-enabled devices

Device Leadership Negotiation Among Voice Interface Devices

Device designation for audio input monitoring
