US9256269B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9256269-B2 |
| Application number | US-201313791716-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 8, 2013 |
| Priority date | Feb 20, 2013 |
| Publication date | Feb 9, 2016 |
| Grant date | Feb 9, 2016 |
Official abstract text for this publication.
Aspects of the present disclosure describe methods and apparatuses for executing operations on a client device platform that is operating in a low-power state. A first analysis may be used to assign a first confidence score to a recorded non-tactile input. When the first confidence score is above a first threshold an intermediate-power state may be activated. A second more detailed analysis may then assign a second confidence score to the non-tactile input. When the second confidence score is above a second threshold, then the operation is initiated. It is emphasized that this abstract is provided to comply with the rules requiring an abstract that will allow a searcher or other reader to quickly ascertain the subject matter of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
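The two-stage gating described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `two_stage_wake`, `cheap_score`, and `detailed_score`, and the threshold values 0.5 and 0.8, are hypothetical and do not come from the patent. The scorers are assumed to return a similarity value where higher means a closer match.

```python
from enum import Enum
from typing import Any, Callable, Tuple


class PowerState(Enum):
    LOW = "low"
    INTERMEDIATE = "intermediate"


def two_stage_wake(
    sample: Any,
    cheap_score: Callable[[Any], float],     # hypothetical first, coarse analysis
    detailed_score: Callable[[Any], float],  # hypothetical second, detailed analysis
    first_threshold: float = 0.5,            # assumed value, for illustration only
    second_threshold: float = 0.8,           # assumed value, for illustration only
) -> Tuple[PowerState, bool]:
    """Return (power state reached, whether the operation is initiated)."""
    # Stage 1 runs in the low-power state.
    if cheap_score(sample) <= first_threshold:
        # Score is not above the first threshold: stay in low power and
        # never invoke the more expensive analysis.
        return PowerState.LOW, False

    # Stage 2 runs only after the intermediate-power state is activated.
    second = detailed_score(sample)
    return PowerState.INTERMEDIATE, second > second_threshold
```

The detailed scorer is only called once the first score clears its threshold, which is the point of the cascade: the costly analysis stays unpowered for most inputs.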
Opening claim text (preview).
What is claimed is:
1. A method, comprising: recording one or more non-tactile inputs to a device with one or more sensors, wherein the one or more inputs are recorded to a first memory, wherein the device is operating in a low-power state in which power is provided to a first processor and the first memory; generating one or more first confidence scores, wherein each of the one or more first confidence scores is a measure of a degree of similarity between a corresponding recorded non-tactile input and a reference input stored in the first memory; initiating an intermediate-power state of the device when the first confidence score is above a first threshold level, wherein the intermediate-power state comprises providing power to at least a second processor, wherein the second processor has a greater amount of available processing capability than the first processor; outputting a challenge signal when the first confidence score is within a challenge range, and initiating the intermediate-power state when a response to the challenge signal is detected by one or more of the sensors; generating one or more second confidence scores with the second processor, wherein each of the one or more second confidence scores is a measure of a degree of similarity between each recorded non-tactile input and a reference input; generating a command signal that instructs the client device to execute one or more operations that are associated with the reference input when the second confidence score is above a second threshold.
2. The method of claim 1, wherein a first sensor of the one or more sensors is a microphone.
3. The method of claim 2, wherein generating the first confidence score comprises analyzing one or more of the non-tactile inputs with a voice activity detection (VAD) algorithm.
4. The method of claim 3, wherein the VAD algorithm is implemented by an application specific integrated circuit (ASIC).
5. The method of claim 2, wherein generating the first confidence score comprises analyzing one or more of the non-tactile inputs with an automatic speech recognition algorithm.
6. The method of claim 2, wherein generating the one or more first confidence scores comprises analyzing one or more of the non-tactile inputs with a voice activity detection (VAD) algorithm and an automatic speech recognition algorithm.
7. The method of claim 2, wherein a second sensor of the one or more sensors is configured to detect the presence of a human proximate to the client device platform.
8. The method of claim 7, wherein the sensor configured to detect the presence of a human proximate to the client device platform is a video camera.
9. The method of claim 7, wherein the sensor configured to detect the presence of a human proximate to the client device platform is an infrared camera.
10. The method of claim 7, wherein the sensor configured to detect the presence of a human proximate to the client device platform is a terahertz sensor.
11. The method of claim 2, wherein generating the one or more second confidence scores comprises analyzing one or more of the non-tactile inputs with an automatic speech recognition algorithm that utilizes phonemes.
12. The method of claim 2, wherein generating the second confidence score comprises analyzing one or more of the non-tactile inputs with an automatic speech recognition algorithm that utilizes auditory attention cues.
13. The method of claim 2, wherein generating the second confidence score comprises analyzing one or more of the non-tactile inputs with a voice recognition algorithm configured to identify the voice of a specific human.
14. The method of claim 1, wherein one of the one or more sensors is a video camera.
15. The method of claim 14, wherein generating the first confidence score comprises analyzing one or more of the non-tactile inputs with an object recognition algorithm.
16. The method of claim 15, wherein the object recognition algorithm is configured to detect the presence of a human proximate to the client device platform.
17. The method of claim 14, wherein generating the first confidence score comprises analyzing one or more of the non-tactile inputs with a gesture recognition algorithm.
18. The method of claim 14, wherein generating the second confidence score comprises analyzing one or more of the non-tactile inputs with an audio visual speech recognition (ASVR) algorithm.
19. The method of claim 1, wherein one of the one or more sensors is a motion sensor and wherein generating the first confidence score includes performing motion detection.
20. The method of claim 1, wherein the challenge signal that is detectable to a human is a blinking light emitting diode (LED).
21. The method of claim 1, wherein the challenge signal that is an audible tone configured to be detectable by a human.
22. The method of claim 1, wherein the non-tactile response input is an audible phrase.
23. The method of claim 1, wherein the non-tactile response input is a gesture.
24. The method of claim 1, wherein the intermediate-power state is implemented on a cloud based server.
25. The method of claim 24, wherein the one or more non-tactile inputs are delivered over a network to the cloud based server.
26. The method of claim 1, wherein the secondary processor is coupled to a second memory.
27. The method of claim 26, wherein the second memory comprises one or more reference signals that are not stored on the first memory.
28. The method of claim 1, wherein the first processor comprises one or more cores of a multi-core processor.
29. The method of claim 28, wherein the second processor comprises the first processor and one or more additional cores of the multi-core processor.
30. The method of claim 1, wherein one of the one or more the operations is configured to initiate a full-power state on the client device platform.
31. The method of claim 1, wherein one of the one or more the operations is configured to initiate the playback of a particular media title on the client device platform.
32. The method of claim 1, wherein one of the one or more the operations is configured to load a player profile.
33. A client device platform configured to operate on a network, comprising: a processor; a memory coupled to the processor; one or more instructions embodied in memory for execution by the processor, the instructions being configured to implement a method, the method comprising: recording one or more non-tactile inputs to a device with one or more sensors, wherein the one or more inputs are recorded to a first memory, wherein the device is operating in a low-power state in which power is provided to a first processor and the first memory; generating one or more first confidence scores, wherein each of the one or more first confidence scores is a measure of a degree of similarity between a corresponding recorded non-tactile input and a reference input stored in the first memory; initiating an intermediate-power state of the device when the first confidence score is above a first threshold level, wherein the intermediate-power state comprises providing power to at least a second processor, wherein the second processor has a greater amount of available processing capability than the first processor; outputting a c
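The challenge step recited in claim 1 adds a second route into the intermediate-power state, and one way to picture it is the sketch below. It is illustrative only. The claim text shown here does not say where the challenge range sits relative to the first threshold, so the sketch assumes the range runs from a hypothetical `challenge_floor` up to and including the first threshold. The function name and the two callbacks are likewise hypothetical; `emit_challenge` stands in for a signal such as the blinking LED of claim 20, and `response_detected` for a sensor check like the audible phrase or gesture of claims 22 and 23.

```python
from typing import Callable


def should_enter_intermediate(
    first_score: float,
    first_threshold: float,
    challenge_floor: float,                 # assumed lower bound of the challenge range
    emit_challenge: Callable[[], None],     # hypothetical: e.g. blink an LED or play a tone
    response_detected: Callable[[], bool],  # hypothetical: did a sensor pick up a response?
) -> bool:
    """Decide whether the device should leave the low-power state."""
    # Confident match: wake the second processor without asking the user.
    if first_score > first_threshold:
        return True

    # Ambiguous match (assumed range: challenge_floor..first_threshold):
    # prompt the user and wake only if a response is detected.
    if challenge_floor <= first_score <= first_threshold:
        emit_challenge()
        return response_detected()

    # Clearly not a match: stay in the low-power state.
    return False
```

Under this assumption, a borderline score costs one cheap prompt instead of either a missed command or an unnecessary power-up of the second processor.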
Power management, i.e. event-based initiation of a power-saving mode · CPC title
Monitoring of events, devices or parameters that trigger a change in power modality · CPC title
Monitoring the presence, absence or movement of users · CPC title
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title