Determining device groups

US10685652B1 · US · B1

Patent metadata
Field               Value
Publication number  US-10685652-B1
Application number  US-201815928682-A
Country             US
Kind code           B1
Filing date         Mar 22, 2018
Priority date       Mar 22, 2018
Publication date    Jun 16, 2020
Grant date          Jun 16, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This disclosure describes, in part, techniques for determining device groupings, or clusters, for multiple voice-enabled devices. The device clusters may be determined based on metadata data for audio signals (or audio data) generated by each of the multiple voice-enabled devices. For example, a remote system may analyze timestamp data for the audio signals received from the devices, and determine that the devices detected the same voice command of a user based on the timestamp data indicating that the audio signals were received within a threshold period of time from each other. Additionally, the remote system may analyze other metadata of the audio data, such as signal-to-noise (SNR) values, and determine that the SNR values are within a threshold value. The remote system may determine device clusters for the voice-enabled devices of a user based on these, and potentially other, types of metadata of the audio signals.
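
The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `AudioEvent` record, the function name `co_detected_pairs`, and both threshold values (0.5 seconds and 10 dB) are assumptions made for the example, and "SNR values within a threshold value" is read here as "the two SNR values differ by no more than a threshold", which is only one possible reading.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class AudioEvent:
    """Metadata for one audio signal (hypothetical record layout)."""
    device_id: str
    received_at: float  # arrival time at the remote system, in seconds
    snr_db: float       # signal-to-noise ratio of the audio, in dB


def co_detected_pairs(events, max_gap_s=0.5, max_snr_diff_db=10.0):
    """Return device pairs that appear to have heard the same utterance.

    Two devices are paired when their audio arrived within max_gap_s of
    each other and their SNR values differ by at most max_snr_diff_db.
    Both thresholds are illustrative assumptions.
    """
    pairs = set()
    for a, b in combinations(events, 2):
        if a.device_id == b.device_id:
            continue
        close_in_time = abs(a.received_at - b.received_at) <= max_gap_s
        similar_snr = abs(a.snr_db - b.snr_db) <= max_snr_diff_db
        if close_in_time and similar_snr:
            pairs.add(frozenset((a.device_id, b.device_id)))
    return pairs


events = [
    AudioEvent("kitchen", 100.00, 18.0),
    AudioEvent("living-room", 100.20, 14.0),
    AudioEvent("bedroom", 160.00, 20.0),
]
pairs = co_detected_pairs(events)
```

In this example only the kitchen and living-room devices are paired, because the bedroom audio arrives a full minute later. Pairs are stored as `frozenset` values so that the order of the two devices does not matter.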

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, from a first device, first audio data representing first sound; receiving, from a second device, second audio data representing second sound captured by a second microphone of the second device; determining that the first audio data was received within a threshold period of time of when the second audio data was received; based at least in part on the first audio data being received within the threshold period of time of when the second audio data was received, generating an association between the first device and the second device indicating that the first device is located in a same physical environment as the second device; and storing the association indicating that the first device is in the same physical environment as the second device, wherein the association is to be used in future processing of audio data received from only the first device.

2. The method of claim 1, further comprising: identifying a first signal-to-noise (SNR) value associated with the first audio data; identifying a second SNR value associated with the second audio data; determining that the first SNR value is greater than or equal to a threshold SNR value; determining that the second SNR value is greater than or equal to the threshold SNR value; and wherein the generating the association between the first device and the second device is further based at least in part on the first SNR value and the second SNR value being greater than or equal to the threshold SNR value.

3. The method of claim 1, further comprising: identifying a first audio-signal metric associated with the first audio data; identifying a second audio-signal metric associated with the second audio data; determining that the first audio-signal metric is within a threshold amount to the second audio-signal metric; and wherein the generating the association between the first device and the second device is further based at least in part on the first audio-signal metric is within a threshold amount to the second audio-signal metric.

4. The method of claim 1, further comprising: determining a number of instances where audio data was received from the first device within the threshold period of time of when audio data was received from the second device; determining that the number of instances is greater than or equal to a threshold number of instances; and wherein the generating the association between the first device and the second device is further based at least in part on the number of instances being greater than or equal to the threshold number of instances.

5. The method of claim 1, further comprising: determining a number of instances where audio data was received from the first device within the threshold period of time of when audio data was received from the second device; identifying first signal-to-noise (SNR) values associated with the first device, wherein an SNR value of the first SNR values is associated with corresponding audio data received from the first device in the number of instances; identifying second SNR values associated with the second device, wherein an SNR value of the second SNR values is associated with corresponding audio data received from the second device in the number of instances; determining that, for more than a threshold number of the number of the instances, the first SNR values and the second SNR values are greater than or equal to a threshold SNR value; and wherein the generating the association between the first device and the second device is further based at least in part on the determining that, for more than the threshold number of the number of the instances, the first SNR values and the second SNR values are greater than or equal to the threshold SNR value.

6. The method of claim 1, further comprising, prior to the generating the association: storing, in memory of a network-based computing device, an initial association between the first device, the second device, and a third device; determining that third audio data was not received from the third device within the threshold period of time from when at least one of the first audio data or the second audio data was received; and based at least in part on the third audio data not being received from the third device within the threshold period of time from when at least one of the first audio data or the second audio data was received, removing the initial association from the memory of the network-based computing device.

7. The method of claim 1, further comprising: storing, in memory of one or more network-based computing devices, the association between the first device and the second device; determining metadata for the association between the first device and the second device, the metadata indicating at least one of: a device name assigned to the first device; an action previously performed by the first device; or an identity of a user that issued a voice command represented by the first audio data; and storing the metadata in the memory of the one or more network-based computing devices.

8. A system comprising: one or more processors; and computer-readable media storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: receiving, from a first device, first audio data representing first sound captured by a first microphone of the first device; receiving, from a second device, second audio data representing second sound captured by a second microphone of the second device; determining that the first audio data was received within a threshold period of time of when the second audio data was received; based at least in part on the first audio data being received within the threshold period of time of when the second audio data was received, generating an association between the first device and the second device; receiving, from the first device, third audio data representing a speech utterance captured by the first microphone of the first device; determining intent data representing the speech utterance; determining, based at least in part on the association and the intent data, a command to cause the second device to perform an action; and sending, to the second device, command data indicating the command.

9. The system of claim 8, the operations further comprising: determining a first signal-to-noise (SNR) value associated with the first audio data; determining a second SNR value associated with the second audio data; determining that the first SNR value is greater than or equal to a threshold SNR value; and determining that the second SNR value is greater than or equal to the threshold SNR value, wherein generating the association between the first device is further based at least in part on the first SNR value and second SNR value being greater than or equal to the threshold SNR value.

10. The system of claim 8, the operations further comprising: identifying a first audio-signal metric associated with the first audio data; identifying a second audio-signal metric associated with the second audio data; and determining that the first audio-signal metric is within a threshold amount to the second audio-signal metric, wherein the generating the association between the first device and the second device is further based at least in part on the first audio-signal metric is within a threshold amount to the second audio-signal metric.

11. The system of claim 8, the operations further comprising: determining a number of instances where audio data was received from the first device within the threshold period of tim
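
The repeated-instance test in claims 2 and 4 can be illustrated with a short sketch. It is written under stated assumptions rather than taken from the patent: events are hypothetical `(device_id, received_at, snr_db)` tuples, the function names are invented for the example, and the threshold values (0.5 seconds, 10 dB, 3 instances) are placeholders. An association is generated only after two devices have been heard together, each at or above the SNR floor, at least the threshold number of times.

```python
from collections import Counter
from itertools import combinations


def count_co_detections(events, max_gap_s=0.5, min_snr_db=10.0):
    """Count, per device pair, how often both devices sent audio within
    max_gap_s of each other with both SNR values at or above min_snr_db.

    events: iterable of (device_id, received_at, snr_db) tuples
    (a hypothetical layout chosen for this sketch).
    """
    counts = Counter()
    for (dev_a, t_a, snr_a), (dev_b, t_b, snr_b) in combinations(events, 2):
        if dev_a == dev_b:
            continue  # the same device cannot pair with itself
        if abs(t_a - t_b) > max_gap_s:
            continue  # not received within the threshold period of time
        if snr_a < min_snr_db or snr_b < min_snr_db:
            continue  # at least one signal is below the SNR threshold
        counts[frozenset((dev_a, dev_b))] += 1
    return counts


def associations(events, min_instances=3, **thresholds):
    """Return the device pairs seen together at least min_instances times."""
    counts = count_co_detections(events, **thresholds)
    return {pair for pair, n in counts.items() if n >= min_instances}


events = []
for t in (0.0, 60.0, 120.0):
    events.append(("kitchen", t, 20.0))
    events.append(("living-room", t + 0.1, 15.0))
events.append(("bedroom", 120.2, 4.0))  # close in time, but below the SNR floor
events.append(("garage", 300.0, 25.0))  # strong signal, but no nearby event

result = associations(events)
```

Here the kitchen and living-room devices are associated after three co-detections, while the bedroom device is excluded by its low SNR and the garage device by timing. Requiring several instances trades slower cluster discovery for fewer spurious associations from a single coincidence.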

Assignees

Amazon Tech Inc

Inventors

Classifications

  • Speaker identification or verification techniques · CPC title

  • G10L25/51 · Primary

    for comparison or discrimination · CPC title

  • the extracted parameters being power information · CPC title

  • G10L15/22 · Primary

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10685652B1 cover?
This disclosure describes, in part, techniques for determining device groupings, or clusters, for multiple voice-enabled devices. The device clusters may be determined based on metadata data for audio signals (or audio data) generated by each of the multiple voice-enabled devices. For example, a remote system may analyze timestamp data for the audio signals received from the devices, and determ…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G10L25/51. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 16, 2020 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).