Multisensory speech detection
US-9570094-B2 · Feb 14, 2017 · US
US10026419B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10026419-B2 |
| Application number | US-201514645802-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 12, 2015 |
| Priority date | Nov 10, 2008 |
| Publication date | Jul 17, 2018 |
| Grant date | Jul 17, 2018 |
Official abstract text for this publication.
A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
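The abstract's chain (orientation, then operating mode, then speech detection parameters) can be sketched as below. This is a minimal illustration, not the patented implementation: the mode names, the `pitch_deg` and `near_face` inputs, the 45-degree cutoff, and every parameter value are hypothetical and are not taken from the publication.

```python
from dataclasses import dataclass
from enum import Enum


class OperatingMode(Enum):
    # Hypothetical mode names chosen for illustration only.
    HELD_TO_EAR = "held_to_ear"
    HELD_TO_MOUTH = "held_to_mouth"
    HELD_AWAY = "held_away"


@dataclass(frozen=True)
class SpeechDetectionParams:
    """Parameters that specify when speech detection begins or ends."""
    start_on_pose: bool        # begin detection as soon as the pose is seen
    energy_threshold: float    # energy below this counts as non-speech
    trailing_silence_ms: int   # quiet time required before detection ends


# Assumed values; a real system would tune these per device.
PARAMS_BY_MODE = {
    OperatingMode.HELD_TO_EAR: SpeechDetectionParams(True, 0.01, 800),
    OperatingMode.HELD_TO_MOUTH: SpeechDetectionParams(True, 0.02, 600),
    OperatingMode.HELD_AWAY: SpeechDetectionParams(False, 0.10, 300),
}


def determine_operating_mode(pitch_deg: float, near_face: bool) -> OperatingMode:
    """Map a coarse orientation reading to an operating mode.

    pitch_deg is an assumed tilt angle (e.g. derived from an accelerometer);
    near_face is an assumed proximity-sensor reading.
    """
    if near_face and pitch_deg >= 45.0:
        return OperatingMode.HELD_TO_EAR
    if near_face:
        return OperatingMode.HELD_TO_MOUTH
    return OperatingMode.HELD_AWAY


def speech_detection_params(pitch_deg: float, near_face: bool) -> SpeechDetectionParams:
    """Select speech detection parameters from the determined operating mode."""
    return PARAMS_BY_MODE[determine_operating_mode(pitch_deg, near_face)]
```

Keeping the mode decision separate from the parameter table means the thresholds can be changed without touching the orientation logic.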
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method, comprising: initiating, by a mobile computing device, an audio recording process using a microphone of the mobile computing device; identifying, by the mobile computing device, a first orientation of the mobile computing device at a beginning of the audio recording process; determining, during the audio recording process, that the mobile computing device has transitioned from the first orientation to a second orientation; in response to determining that the mobile computing device has transitioned from the first orientation to a second orientation, determining a speech energy threshold for comparing to speech received during the audio recording process and while the mobile computing device is in the second orientation; comparing an energy of the speech received during the audio recording process and while the mobile computing device is in the second orientation to the speech energy threshold; determining an end of speech condition based on (i) the energy of the speech received during the audio recording process and while the mobile computing device is in the second orientation and (ii) the speech energy threshold; and stopping, based on determining the end of speech condition, the audio recording process.

2. The method of claim 1, wherein identifying the first orientation of the mobile computing device is performed in response to a proximity sensor of the mobile computing device detecting an object within a predetermined distance of the proximity sensor.

3. The method of claim 1, wherein identifying the first orientation of the mobile computing device comprises detecting that a user is within a predetermined distance of a proximity sensor of the mobile computing device, and wherein determining that the mobile computing device has transitioned from the first orientation to the second orientation includes determining that the user is no longer within the predetermined distance of the proximity sensor.

4. The method of claim 1, wherein determining that the mobile computing device has transitioned from the first orientation to the second orientation is performed using one or more accelerometers of the mobile computing device.

5. The method of claim 1, comprising: classifying the transition of the mobile computing device from the first orientation to the second orientation as one of a “to mouth” gesture, a “from mouth” gesture, a “to ear” gesture, or a “from ear” gesture, wherein determining the speech energy threshold for comparing to speech received during the audio recording process and while the mobile computing device is in the second orientation is based on classifying the transition of the mobile computing device from the first orientation to the second orientation as one of the “to mouth” gesture, the “from mouth” gesture, the “to ear” gesture, or the “from ear” gesture.

6. The method of claim 1, wherein determining, during the audio recording process, that the mobile computing device has transitioned from the first orientation to the second orientation comprises determining that an angle between a user and the mobile computing device is outside of a selected set of angles.

7. The method of claim 1, wherein determining, during the audio recording process, that the mobile computing device has transitioned from the first orientation to the second orientation comprises determining that the mobile computing device is beyond a predetermined distance from a portion of a user.

8. The method of claim 1, comprising: while the mobile computing device is in the first orientation, comparing the energy of the speech received during the audio recording process to a different speech energy threshold; determining that the energy of the speech received during the audio recording process and while the mobile computing device is in the first orientation is greater than the different speech energy threshold; and based on determining that the energy of the speech received during the audio recording process and while the mobile computing device is in the first orientation is greater than the different speech energy threshold, continuing the audio recording process.

9. A non-transitory computer storage medium encoded with a computer program, the program comprising instructions that when executed by data processing apparatus cause the data processing apparatus to perform operations comprising: initiating, by a mobile computing device, an audio recording process using a microphone of the mobile computing device; identifying, by the mobile computing device, a first orientation of the mobile computing device at a beginning of the audio recording process; determining, during the audio recording process, that the mobile computing device has transitioned from the first orientation to a second orientation; in response to determining that the mobile computing device has transitioned from the first orientation to a second orientation, determining a speech energy threshold for comparing to speech received during the audio recording process and while the mobile computing device is in the second orientation; comparing an energy of the speech received during the audio recording process and while the mobile computing device is in the second orientation to the speech energy threshold; determining an end of speech condition based on (i) the energy of the speech received during the audio recording process and while the mobile computing device is in the second orientation and (ii) the speech energy threshold; and stopping, based on determining the end of speech condition, the audio recording process.

10. The computer storage medium of claim 9, wherein identifying the first orientation of the mobile computing device is performed in response to a proximity sensor of the mobile computing device detecting an object within a predetermined distance of the proximity sensor.

11. The computer storage medium of claim 9, wherein identifying the first orientation of the mobile computing device comprises detecting that a user is within a predetermined distance of a proximity sensor of the mobile computing device, and wherein determining that the mobile computing device has transitioned from the first orientation to the second orientation includes determining that the user is no longer within the predetermined distance of the proximity sensor.

12. The computer storage medium of claim 9, wherein determining that the mobile computing device has transitioned from the first orientation to the second orientation is performed using one or more accelerometers of the mobile computing device.

13. The computer storage medium of claim 9, comprising: classifying the transition of the mobile computing device from the first orientation to the second orientation as one of a “to mouth” gesture, a “from mouth” gesture, a “to ear” gesture, or a “from ear” gesture, wherein determining the speech energy threshold for comparing to speech received during the audio recording process and while the mobile computing device is in the second orientation is based on classifying the transition of the mobile computing device from the first orientation to the second orientation as one of the “to mouth” gesture, the “from mouth” gesture, the “to ear” gesture, or the “from ear” gesture.

14. The computer storage medium of claim 9, wherein determining, during the audio recording process, that the mobile computing device has transitioned from the first orientation to the second orientation comprises determining that an angle between a user and the mobile computing device is outside of a selected set of angles.

15. The co
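The recording flow recited in claim 1 (re-selecting the speech energy threshold when the device changes orientation, then stopping on an end-of-speech condition) might look roughly like the sketch below. It is an illustration under stated assumptions, not the claimed method itself: the orientation labels, the per-frame energy values, the threshold table, and the `min_quiet_frames` rule for declaring end of speech are all hypothetical. As a simplification, the sketch also lets recording end in the first orientation, using that orientation's own threshold.

```python
from typing import Dict, Iterable, Optional, Tuple


def find_end_of_speech(
    frames: Iterable[Tuple[str, float]],
    thresholds: Dict[str, float],
    min_quiet_frames: int = 3,
) -> Optional[int]:
    """Return the index of the frame at which recording would stop.

    frames yields (orientation, energy) pairs, one per audio frame.
    thresholds maps each orientation label to its speech energy threshold.
    End of speech is assumed to mean min_quiet_frames consecutive frames
    whose energy is below the threshold for the current orientation.
    Returns None if no end-of-speech condition is reached.
    """
    current = None   # orientation of the most recent frame
    threshold = 0.0  # threshold in force for the current orientation
    quiet = 0        # consecutive frames below the threshold

    for index, (orientation, energy) in enumerate(frames):
        if orientation != current:
            # First frame, or a transition to a new orientation:
            # pick that orientation's threshold and restart the count.
            current = orientation
            threshold = thresholds[orientation]
            quiet = 0

        if energy < threshold:
            quiet += 1
        else:
            quiet = 0  # speech energy is high enough; keep recording

        if quiet >= min_quiet_frames:
            return index

    return None
```

With hypothetical thresholds of `{"ear": 0.01, "away": 0.1}`, energies around 0.04 keep the recording going while the device stays at the ear but end it three frames after the device is moved away, which is the orientation-dependent behavior the claim describes.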
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
with voice recognition means · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title
Arrangements for converting the position or the displacement of a member into a coded form · CPC title