Hotword detection on multiple devices
US-9424841-B2 · Aug 23, 2016 · US
US11087760B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11087760-B2 |
| Application number | US-201916696622-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 26, 2019 |
| Priority date | Dec 30, 2016 |
| Publication date | Aug 10, 2021 |
| Grant date | Aug 10, 2021 |
Official abstract text for this publication.
A system of multi-modal transmission of packetized data in a voice activated data packet based computer network environment is provided. A natural language processor component can parse an input audio signal to identify a request and a trigger keyword. Based on the input audio signal, a direct action application programming interface can generate a first action data structure, and a content selector component can select a content item. An interface management component can identify first and second candidate interfaces, and respective resource utilization values. The interface management component can select, based on the resource utilization values, the first candidate interface to present the content item. The interface management component can provide the first action data structure to the client computing device for rendering as audio output, and can transmit the content item converted for a first modality to deliver the content item for rendering from the selected interface.
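The interface selection step described in the abstract can be illustrated with a minimal sketch. The abstract says only that the first candidate interface is selected "based on the resource utilization values"; choosing the candidate with the lowest value is an assumption made here for illustration, and the names `CandidateInterface` and `select_interface` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CandidateInterface:
    """Hypothetical record for one candidate interface."""
    name: str
    modality: str                # e.g. "audio" or "visual"
    resource_utilization: float  # assumed: lower means cheaper to use


def select_interface(candidates: Sequence[CandidateInterface]) -> CandidateInterface:
    """Pick the candidate with the lowest resource utilization value.

    Assumption: "select based on the resource utilization values" is read
    here as "minimize the value"; the source does not fix the rule.
    """
    if not candidates:
        raise ValueError("at least one candidate interface is required")
    return min(candidates, key=lambda c: c.resource_utilization)


candidates = [
    CandidateInterface("display", "visual", resource_utilization=0.7),
    CandidateInterface("speaker", "audio", resource_utilization=0.2),
]
chosen = select_interface(candidates)
print(chosen.name)
```

A real system would presumably weigh more than a single scalar per interface; the single field keeps the selection rule easy to see.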
Opening claim text (preview).
What is claimed is:
1. A system to transmit data in a voice-based computing environment, comprising: a data processing system comprising one or more processors and memory to: receive, via an interface of the data processing system, data packets comprising an input audio signal detected by a sensor of a client computing device; parse the input audio signal to identify a request; generate, based on the request, a first action data structure; select a content item responsive to the request; identify a plurality of interfaces of the client computing device; determine a characteristic of each of the plurality of interfaces; select, based on the characteristic of each of the plurality of interfaces, a first interface of the plurality of interfaces having a first characteristic; and provide the first action data structure and the content item to the client computing device for presentation as audio output via the first interface of the client computing device.
2. The system of claim 1, comprising: the data processing system to provide the content item in a modality compatible with the first interface.
3. The system of claim 1, comprising the data processing system to: determine a capability of the first interface; and convert the content item to a modality compatible with the capability of the first interface.
4. The system of claim 1, wherein the first interface comprises an audio interface, comprising: the data processing system to provide the content item for presentation via the audio interface.
5. The system of claim 1, comprising the data processing system to: select a second content item based on the first characteristic of the first interface; and provide the second content item to the client computing device for presentation via the first interface.
6. The system of claim 1, comprising the data processing system to: parse the input audio signal to identify a keyword corresponding to the request; and select the content item based at least on the keyword.
7. The system of claim 1, comprising the data processing system to: select a second content item; and provide the second content item to the client computing device for presentation via a second interface of the client computing device that has a different characteristic than the first characteristic.
8. The system of claim 1, comprising the data processing system to: select a second content item comprising visual output; select a second interface comprising a display device based on the second content item comprising visual output; and provide the second content item to the client computing device for presentation via the second interface of the client computing device.
9. The system of claim 1, wherein the characteristic of each of the plurality of interfaces comprises a resource utilization value, comprising: the data processing system to select the first interface based on the resource utilization value to reduce resource utilization associated with presentation of the content item.
10. The system of claim 9, wherein the resource utilization value comprises at least one of a battery status, a processor utilization, a memory utilization, or a network bandwidth utilization.
11. The system of claim 1, comprising: the data processing system to deliver the content item to the client computing device subsequent to transmission of the first action data structure to the client computing device.
12. The system of claim 1, wherein the plurality of interfaces include at least one of a display screen, an audio interface, a vibration interface, an email interface, a push notification interface, a mobile computing device interface, a portable computing device application, a content slot on an online document, a chat application, mobile computing device application, a laptop, a watch, a virtual reality headset, and a speaker.
13. A method of transmitting data in a voice-based computing environment, comprising: receiving, by a data processing system comprising one or more processors and memory, data packets comprising an input audio signal detected by a sensor of a client computing device; parsing, by the data processing system, the input audio signal to identify a request; generating, by the data processing system based on the request, a first action data structure; selecting, by the data processing system, a content item responsive to the request; identifying, by the data processing system, a plurality of interfaces of the client computing device; determining, by the data processing system, a characteristic of each of the plurality of interfaces; selecting, by the data processing system based on the characteristic of each of the plurality of interfaces, a first interface of the plurality of interfaces having a first characteristic; and providing, by the data processing system, the first action data structure and the content item to the client computing device for presentation as audio output via the first interface of the client computing device.
14. The method of claim 13, comprising: providing the content item in a modality compatible with the first interface.
15. The method of claim 13, comprising: determining a capability of the first interface; and converting the content item to a modality compatible with the capability of the first interface.
16. The method of claim 13, wherein the first interface comprises an audio interface, comprising: providing the content item for presentation via the audio interface.
17. The method of claim 13, comprising: selecting a second content item based on the first characteristic of the first interface; and providing the second content item to the client computing device for presentation via the first interface.
18. The method of claim 13, comprising: parsing the input audio signal to identify a keyword corresponding to the request; and selecting the content item based at least on the keyword.
19. The method of claim 13, comprising: selecting a second content item; and providing the second content item to the client computing device for presentation via a second interface of the client computing device that has a different characteristic than the first characteristic.
20. The method of claim 13, comprising: selecting a second content item comprising visual output; selecting a second interface comprising a display device based on the second content item comprising visual output; and providing the second content item to the client computing device for presentation via the second interface of the client computing device.
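Claims 3 and 15 recite determining a capability of the first interface and converting the content item to a compatible modality. The following is a rough sketch of that step under stated assumptions: the patent does not specify data types or conversion routines, so `ContentItem`, `Interface`, `CONVERTERS`, and `convert_for_interface` are hypothetical names, and the converters below are placeholders that only tag the payload rather than performing real speech synthesis or rendering.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class ContentItem:
    payload: str
    modality: str  # assumed labels: "text", "audio", "visual"


@dataclass(frozen=True)
class Interface:
    name: str
    capabilities: FrozenSet[str]  # modalities the interface can present


def _text_to_audio(item: ContentItem) -> ContentItem:
    # Placeholder: a real system would call a text-to-speech engine here.
    return replace(item, payload=f"<speech>{item.payload}</speech>", modality="audio")


def _text_to_visual(item: ContentItem) -> ContentItem:
    # Placeholder: a real system would render the text for a display.
    return replace(item, payload=f"<card>{item.payload}</card>", modality="visual")


# Hypothetical registry keyed by (source modality, target modality).
CONVERTERS: Dict[Tuple[str, str], Callable[[ContentItem], ContentItem]] = {
    ("text", "audio"): _text_to_audio,
    ("text", "visual"): _text_to_visual,
}


def convert_for_interface(item: ContentItem, interface: Interface) -> ContentItem:
    """Return the item in a modality the interface is capable of presenting."""
    if item.modality in interface.capabilities:
        return item  # already compatible, no conversion needed
    for target in sorted(interface.capabilities):
        converter = CONVERTERS.get((item.modality, target))
        if converter is not None:
            return converter(item)
    raise ValueError(f"cannot convert {item.modality!r} for {interface.name!r}")


speaker = Interface("speaker", frozenset({"audio"}))
converted = convert_for_interface(ContentItem("hello", "text"), speaker)
print(converted.modality, converted.payload)
```

Keeping the converters in a lookup table means a new modality pair can be supported by adding one entry, without changing the selection logic.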
with rate being modified by the source upon detecting a change of network conditions · CPC title
Parsing for meaning understanding · CPC title
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
Speech to text systems (G10L15/08 takes precedence) · CPC title
Word spotting · CPC title