US11257504B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11257504-B2 |
| Application number | US-202016881625-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 22, 2020 |
| Priority date | May 30, 2014 |
| Publication date | Feb 22, 2022 |
| Grant date | Feb 22, 2022 |
Official abstract text for this publication.
This relates to systems and processes for using a virtual assistant to control electronic devices. In one example process, a user can speak an input in natural language form to a user device to control one or more electronic devices. The user device can transmit the user speech to a server to be converted into a textual representation. The server can identify the one or more electronic devices and appropriate commands to be performed by the one or more electronic devices based on the textual representation. The identified one or more devices and commands to be performed can be transmitted back to the user device, which can forward the commands to the appropriate one or more electronic devices for execution. In response to receiving the commands, the one or more electronic devices can perform the commands and transmit their current states to the user device.
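The round trip described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the lookup table `INTENTS`, the classes `Command` and `Device`, and the functions `server_resolve` and `user_device_forward` are hypothetical names introduced here. Speech-to-text conversion is assumed to have already happened, and natural-language processing is replaced by a simple dictionary lookup.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    device_id: str  # identification of the device that should perform the command
    action: str     # state to set, e.g. "ON" or "LOCKED"


# Hypothetical server-side table from recognized text to commands.
# This lookup is only a stand-in for real natural-language processing.
INTENTS = {
    "turn on the lamp": [Command("lamp-1", "ON")],
    "lock up": [Command("door-1", "LOCKED"), Command("garage-1", "CLOSED")],
}


def server_resolve(text: str) -> list[Command]:
    """Server step: identify devices and commands from the textual representation."""
    return INTENTS.get(text.strip().lower(), [])


class Device:
    """Stand-in for an electronic device that performs a command and reports its state."""

    def __init__(self, device_id: str, state: str) -> None:
        self.device_id = device_id
        self.state = state

    def perform(self, action: str) -> str:
        self.state = action
        return self.state


def user_device_forward(commands: list[Command], devices: dict[str, Device]) -> dict[str, str]:
    """User-device step: forward each command to its device and collect current states."""
    states = {}
    for cmd in commands:
        device = devices.get(cmd.device_id)
        # A device that cannot be reached is reported with an error state.
        states[cmd.device_id] = device.perform(cmd.action) if device else "ERROR"
    return states
```

For example, if only `door-1` is reachable, `user_device_forward(server_resolve("Lock up"), {"door-1": Device("door-1", "UNLOCKED")})` returns the new state for the door and `"ERROR"` for the garage door. The returned dictionary plays the role of the states that the user device would report back to the server.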
Opening claim text (preview).
What is claimed is:

1. A non-transitory computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of one or more servers, cause the one or more servers to: receive, from a user device, data corresponding to an audio input comprising a user speech; perform speech to text conversion on the data corresponding to the audio input to generate a textual representation of the user speech; determine that the textual representation of the user speech represents a user intent to change a state of each of a plurality of electronic devices based on a stored configuration, wherein the stored configuration defines the state of each of the plurality of electronic devices to use in response to a command that references the configuration; and transmit to the user device: a plurality of commands to set the state of each of the plurality of electronic devices based on the configuration; and identifications associated with each of the plurality of commands, wherein the identifications identify each of the plurality of electronic devices for performing each of the plurality of commands.

2. The non-transitory computer-readable storage medium of claim 1, wherein the one or more programs include further instructions, which when executed by the one or more processors, cause the one or more servers to: after transmitting the plurality of commands and the identifications, receive an updated state of each of the plurality of electronic devices from the user device; and update a state of one or more of the plurality of electronic devices in a database based on the received updated state of each of the plurality of electronic devices.

3. The non-transitory computer-readable storage medium of claim 2, wherein the one or more programs include further instructions, which when executed by the one or more processors, cause the one or more servers to: after receiving the updated states of each of the plurality of electronic devices, generate an audio or visual indication of a result of the transmitted plurality of commands, wherein the result is based on the transmitted plurality of commands and the received updated state of each of the plurality of electronic devices; and transmit, to the user device, data corresponding to the audio or visual indication of the result.

4. The non-transitory computer-readable storage medium of claim 2, wherein at least one of the updated states of each of the plurality of electronic devices is an error state, and wherein the error state represents an unavailable or undetermined state of an electronic device.

5. The non-transitory computer-readable storage medium of claim 2, wherein the updated state of each of the plurality of electronic devices comprises at least one or an ON/OFF state, a dimmable state, a color state, an ACTIVE/INACTIVE state, a LOCKED/UNLOCKED state, an OPEN/CLOSED state, or a temperature state.

6. The non-transitory computer-readable storage medium of claim 1, wherein the user device is a mobile phone, a desktop computer, a laptop computer, a tablet computer, a portable media player, a television, a television set-top box, or a wearable electronic device.

7. The non-transitory computer-readable storage medium of claim 1, wherein the plurality of electronic devices includes at least one of a light bulb, an electrical outlet, a switch, a door lock, a garage door, and a thermostat.

8. The non-transitory computer-readable storage medium of claim 1, wherein the one or more programs include further instructions, which when executed by the one or more processors, cause the one or more servers to: receive, from the user device, data corresponding to a location of the user device.

9. The non-transitory computer-readable storage medium of claim 8, wherein the one or more servers determine that the textual representation of the user speech represents a user intent to change the state of each of a plurality of electronic devices based on a stored configuration further based on the data corresponding to the location of the user device.

10. The non-transitory computer-readable storage medium of claim 1, wherein the one or more programs include further instructions, which when executed by the one or more processors, cause the one or more servers to: receive, from the user device, data corresponding to a current date and time.

11. The non-transitory computer-readable storage medium of claim 10, wherein the one or more servers determine that the textual representation of the user speech represents a user intent to change the state of each of a plurality of electronic devices based on a stored configuration further based on the data corresponding to the current date and time.

12. A method for controlling electronic devices using a virtual assistant on a user device, the method comprising: receiving, by one or more servers, data corresponding to an audio input comprising a user speech; performing speech to text conversion on the data corresponding to the audio input to generate a textual representation of the user speech; determining that the textual representation of the user speech represents a user intent to change a state of each of a plurality of electronic devices based on a stored configuration, wherein the stored configuration defines the state of each of the plurality of electronic devices to use in response to a command that references the configuration; and transmitting to the user device: a plurality of commands to set the state of each of the plurality of electronic devices based on the configuration; and identifications associated with each of the plurality of commands, wherein the identifications identify each of the plurality of electronic devices for performing each of the plurality of commands.

13. The method of claim 12, further comprising: after transmitting the plurality of commands and the identifications, receiving an updated state of each of the plurality of electronic devices from the user device; and updating a state of one or more of the plurality of electronic devices in a database based on the received updated state of each of the plurality of electronic devices.

14. The method of claim 13, further comprising: after receiving the updated states of each of the plurality of electronic devices, generating an audio or visual indication of a result of the transmitted plurality of commands, wherein the result is based on the transmitted plurality of commands and the received updated state of each of the plurality of electronic devices; and transmitting, to the user device, data corresponding to the audio or visual indication of the result.

15. The method of claim 13, wherein at least one of the updated states of each of the plurality of electronic devices is an error state, and wherein the error state represents an unavailable or undetermined state of an electronic device.

16. The method of claim 13, wherein the updated state of each of the plurality of electronic devices comprises at least one or an ON/OFF state, a dimmable state, a color state, an ACTIVE/INACTIVE state, a LOCKED/UNLOCKED state, an OPEN/CLOSED state, or a temperature state.

17. The method of claim 12, wherein the user device is a mobile phone, a desktop computer, a laptop computer, a tablet computer, a portable media player, a television, a television set-top box, or a wearable electronic device.

18. The method of claim 12, wherein the plurality of electronic devices includes at least one of a light bulb, an electrical outlet, a
based on user interaction within the home (receiver circuitry for displaying additional information being controlled by a remote control apparatus H04N21/42204) · CPC title
Controlling appliance services of a home automation network by calling their functionalities (arrangements in telecontrol or telemetry systems for selectively calling a substation from a main station; in which substation desired apparatus is selected for applying a control signal thereto or for obtaining measured values therefrom H04Q9/00) · CPC title
Interactive procedures; Man-machine interfaces · CPC title
Domotique, domestic, home control, automation, smart house · CPC title
Voice, vocal command or message · CPC title